ABSTRACT

Accurately predicting how the cosmic abundance of neutral hydrogen evolves with 
redshift is a challenging problem facing modellers of galaxy formation. We investigate 
the predictions of four currently favoured semi-analytical galaxy formation models ap- 
plied to the Millennium simulation for the mass function of cold neutral gas (atomic 
and molecular) in galaxies as a function of redshift, and we use these predictions 
to construct number counts for the next generation of all-sky neutral atomic hydro- 
gen (HI) surveys. Despite the different implementations of the physical ingredients of 
galaxy formation, we find that the model predictions are broadly consistent with one 
another; the key differences reflect how the models treat AGN feedback and how the 
timescale for star formation evolves with redshift. The models produce mass functions 
of cold gas in galaxies that are generally in good agreement with HI surveys at z=0. 
Interestingly we find that these mass functions do not evolve significantly with red- 
shift. Adopting a simple conversion factor for cold gas mass to HI mass that we apply 
to all galaxies at all redshifts, we derive mass functions of HI in galaxies from the 
predicted mass functions of cold gas, which we use to predict the number counts of 
sources likely to be detected by HI surveys on next generation radio telescopes such as 
the Square Kilometre Array and its pathfinders. We find the number counts peak at 
$\sim 4\times10^3/4\times10^4/3\times10^5$ galaxies per square degree at $z\sim0.1/0.2/0.5$ for a year long
HI hemispheric survey on a 1%/10%/100% SKA with a 30 square degree field of view, 
corresponding to an integration time of 12 hours. On a full SKA with a 200 square 
degree field of view (equivalent to an integration time of 80 hours) the number counts 
peak at $5\times10^5$ galaxies per square degree at $z \sim 0.6$. We show also how adopting
a conversion factor for cold gas mass to HI mass that varies from galaxy to galaxy 
impacts on number counts. In addition, we examine how the typical angular sizes of 
galaxies vary with redshift. These decline strongly with increasing redshift at $z \lesssim 0.5$
and more gently at $z \gtrsim 0.5$; the median angular size varies between 5" and 10" at
$z=0.1$, 0.5" and 3" at $z=1$ and 0.2" and 1" at $z=3$ for galaxies with HI masses in
excess of $10^9\,h^{-1}\,{\rm M}_\odot$, depending on the precise model. Taken together, these results
make clear that forthcoming HI surveys will provide important and powerful tests of 
theoretical galaxy formation models.

1 INTRODUCTION

Neutral gas, predominantly atomic hydrogen (HI) along 
with molecular hydrogen (H2) and helium (He), plays a fun- 
damental role in galaxy formation, principally as the raw 
material from which stars are made. At any given time the
fraction of a galaxy's mass that is in the form of HI will be
determined by the competing rates at which it is depleted 
(by, for example, star formation, photo-ionisation and expul- 
sion via winds) and replenished (by, for example, recombi- 
nation and accretion from the galaxy's surroundings). These 
processes are integral to any theory of galaxy formation and 
so we expect that understanding how the HI properties of
galaxies vary with redshift and environment will provide us
with important insights into how galaxies form. 

Our knowledge of HI in galaxies derives from radio ob- 
servations of the rest-frame 21-cm emission line, which al- 
lows us to measure the density, temperature and velocity 
distribution of HI along our line of sight. Thanks to surveys
such as HIPASS (HI Parkes All-Sky Survey; see Meyer et al.
2004) and more recently ALFALFA (Arecibo Legacy Fast
ALFA Survey; see Giovanelli et al. 2005), we have a good
understanding of the HI properties of galaxies in the low- 
redshift Universe, at z < 0.05. For example, HIPASS has re- 
vealed that most HI is associated with galaxies and that the 
galaxy population detected in 21-cm emission is essentially 
the same as that seen at optical and infrared wavelengths but 
weighted towards gas-rich systems, which tend to be late-
type (Meyer et al. 2004). Furthermore, HIPASS data have
allowed accurate measurement of the local HI mass func- 
tion of galaxies (the number density of galaxies with a given 
HI mass per unit comoving volume) and $\Omega_{\rm HI}$ (the global
HI mass density in units of the present-day critical density
$\rho_{\rm crit} = 3H_0^2/8\pi G$). Zwaan et al. (2005) found that the local
HI mass function is well described by a Schechter function
and they estimated the local cosmic HI mass density to be
$\Omega_{\rm HI} = (3.5 \pm 0.4 \pm 0.4) \times 10^{-4}$ (with random and systematic
uncertainties at the 68% confidence limit), assuming a
dimensionless Hubble parameter of $h_{75} = h/0.75 = 1$. This is
approximately 1/10th the value of the cosmic stellar mass density
$\Omega_\ast$ at $z=0$ (cf. Cole et al. 2001).
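To make the definition of $\Omega_{\rm HI}$ concrete, the short sketch below converts the quoted value into a comoving mass density. It is only an illustration: the function names and the rounded physical constants are our own choices, not taken from any of the cited works.

```python
import math

# Rounded physical constants (assumed values, adequate for an illustration).
G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
MPC_IN_M = 3.0857e22   # one megaparsec [m]
MSUN_IN_KG = 1.989e30  # one solar mass [kg]


def critical_density(h):
    """Present-day critical density 3 H0^2 / (8 pi G) in Msun per Mpc^3."""
    # H0 = 100 h km/s/Mpc, converted to s^-1.
    h0 = 100.0 * h * 1.0e3 / MPC_IN_M
    rho_si = 3.0 * h0 ** 2 / (8.0 * math.pi * G)  # kg m^-3
    return rho_si * MPC_IN_M ** 3 / MSUN_IN_KG


def hi_mass_density(omega_hi, h):
    """Comoving HI mass density in Msun per Mpc^3 for a given Omega_HI."""
    return omega_hi * critical_density(h)


# Omega_HI = 3.5e-4 with h = 0.75 (i.e. h_75 = 1), as quoted in the text.
rho_hi = hi_mass_density(3.5e-4, 0.75)
print(f"rho_HI ~ {rho_hi:.2e} Msun/Mpc^3")
```

With these inputs the HI mass density comes out at a few times $10^7\,{\rm M}_\odot\,{\rm Mpc}^{-3}$.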

In contrast, we know comparatively little about HI 
in galaxies at higher redshifts (i.e. z > 0.05). This is 
because detecting the rest frame 21-cm emission from 
individual galaxies has required too great a sensitivity for 
reasonable observing times. There are estimates of $\Omega_{\rm HI}$ at
high redshifts ($z > 1.5$) but these have been deduced from
QSO absorption-line systems and imply that $\Omega_{\rm HI} \sim 10^{-3}$
(e.g. Péroux et al. 2003; Prochaska et al. 2005; Rao et al.
2006). However, this situation will change dramatically
over the next decade with the emergence of a series of 
next generation radio telescopes, culminating in the Square 
Kilometre Array (SKA) that is expected to see first light 
by about 2020. The SKA will have sufficient sensitivity and 
angular resolution to map HI in galaxies out to redshifts
$z > 3$ (see, for example, Blake et al. 2004; Braun 2007).
On a shorter timescale a variety of SKA pathfinders such
as ASKAP (the Australian SKA Pathfinder; see Johnston
et al. 2008), MeerKAT (the Karoo Array Telescope; see
Jonas 2007) and APERTIF (APERture Tile In Focus; see
Verheijen et al. 2008) will carry out HI surveys that, in
some cases, will probe the properties of galaxies out to z ~ 1. 

Because we have so little data on HI in galaxies be- 
yond the local Universe, the results of HI surveys on next 
generation radio telescopes will have a profound impact on 
our understanding of galaxy formation and evolution. For 
example, we have compelling evidence that the cosmic star
formation rate density has decreased by an order of magnitude
since $z \sim 1$ (Madau et al. 1996; Hopkins 2004), yet we
know little about how the HI mass in galaxies evolved over 
the same period. This is precisely the kind of question we 
can hope to answer with forthcoming HI surveys. For this 
reason it is both timely and important to take stock of what 
theoretical galaxy formation models tell us about how quantities
such as $\Omega_{\rm HI}$ and the HI mass function vary with redshift
and environment. Not only can such predictions help us
to interpret the physical significance of observational data, 
they can also provide important input into the design of new 
radio telescopes. 

The primary aim of this paper is to explore the predictions
of currently favoured semi-analytical galaxy formation
models (cf. Cole et al. 2000; Baugh 2006) for the properties
of cold gas in galaxies as a function of redshift in a $\Lambda$ Cold
Dark Matter ($\Lambda$CDM) universe. In particular we investigate
the redshift variation of the cold gas mass function of galaxies
and $\Omega_{\rm cold}$ (i.e. the cold gas mass density parameter) in
the Millennium simulation (Springel et al. 2005). In addition
we convert the predicted cold gas masses to HI masses and 
we use the resulting HI mass functions, along with predic- 
tions for the radii and rotation speeds of galactic discs, to 
predict the number counts of sources one might expect to re- 
cover from HI surveys on a radio telescope with a collecting 
area of 1%/10%/100% of the full SKA. 

The secondary aim of this paper is to compare and 
contrast the predictions of four distinct galaxy formation 
models from the Durham and Munich groups. These 
models incorporate different treatments of the same phys- 
ical processes and in some cases invoke distinct physical 
processes, for example, AGN heating versus supernova- 
driven super-winds. Therefore it is instructive to assess the 
robustness of the basic predictions and to examine whether 
or not these predictions are consistent between models. 
Such a comparison addresses the criticism that semi-analytical
galaxy formation models lack transparency, and it reduces the
uncertainty as to which predictions are robust and which are
sensitive to modest changes in the model parameters. It is in this
context that this work complements in a very natural way
the study of Obreschkow et al. (2009a).



The layout of the paper is as follows. In §2 we present an
overview of the four galaxy formation models we use in this
study. In §3 we examine the basic predictions of these models
for the evolution of the mass function and global mass
density of cold gas between $0 \lesssim z \lesssim 2$, and we determine the
relationship between cold gas mass and the circular velocities
and scale radii of discs. Based on these predictions, in
§4 we determine what the implications are for the number
counts of HI sources in future HI surveys on next generation
radio telescopes, discussing in some detail how we might
convert from cold gas mass to HI mass in §4.1. Finally, in §5
we summarise our results and discuss how future HI surveys 
will provide a powerful test of theoretical models of galaxy 
formation.

2 GALAXY FORMATION MODELS

The evolution of the global cold gas density in the Universe 
and the cold gas content of galaxies depends upon the in- 
terplay between a number of processes: 

(i) the rate at which gas cools radiatively within dark 
matter haloes; 

(ii) the rate at which cold gas is accreted in galaxy merg- 
ers; 

(iii) the rate at which cold gas is consumed in star forma- 
tion; 

(iv) the rate at which gas is reheated or expelled from 
galaxies by sources of feedback (e.g. photo-ionisation, stellar 
winds, supernovae, AGN heating, etc.). 

Semi-analytical modelling provides us with the means to 
study the balance between these phenomena in the context 
of a universe in which structure in the dark matter grows 
hierarchically (for a recent review, see Baugh 2006). The
models refer to cold gas as gas that has cooled radiatively
from a hot phase to below $10^4$ K and is available for star
formation. The cold gas mass is predominantly made up of
neutral atomic hydrogen (HI), along with molecular hydrogen
and helium (see discussion in §4.1).

Here we consider four different semi-analytical galaxy 
formation models from the Durham and Munich groups. 
Although the models follow the same basic philosophy, the 
implementations of various processes differ substantially be- 
tween the two groups. We also consider Durham models with 
different physical ingredients. To remove one possible source 
of difference between the models, all the models discussed 
here adopt the background cosmology used in the Millennium
simulation ($\Omega_{\rm M}=0.25$, $\Omega_\Lambda=0.75$, $\Omega_{\rm b}=0.045$, $\sigma_8=0.9$,
$h=0.73$; cf. Springel et al. 2005).



2.1 The Models 

We now list the different models considered in this paper, 
give their designation and a very brief description of the 
main features of each. The differences between the mod- 
els are discussed in more detail later on in this section. 
The first three models are "Durham" models, which use 
the GALFORM code, and the fourth model is the current 
"Munich" semi-analytical model. The model designations 
are those used in the Millennium Archive (see footnote 2) and in the
subsequent plots.



2 The Millennium galaxy archive can be found at Durham
(http://galaxy-catalogue.dur.ac.uk:8080/Millennium) or Munich
(http://www.g-vo.org/Millennium).

• The Bower et al. (2006) model (hereafter Bower2006a).
In this model, AGN heating suppresses the formation of
bright, massive galaxies by stopping the cooling flow in their
host dark matter haloes, thereby cutting off the supply of
cold gas for star formation. This regulation of the cooling
flow results in a sharp break at the bright end of the luminosity
function. Bower2006a matches the evolution of the stellar
mass function inferred from observations (e.g. Fontana et al.
2004; Drory et al. 2005), the number counts and redshift
distribution of extremely red objects (Gonzalez-Perez et al.
2009) and the abundance of luminous red galaxies (Almeida
et al. 2008).

• The Font et al. (2008) model (hereafter Font2008a).
This model extends Bower2006a with a fundamental change
to the cooling model. Motivated by the simulations of McCarthy
et al. (2008), which track the fate of the hot gas
in haloes after their accretion by more massive objects,
Font2008a assumes that the stripping of hot gas from satellite
haloes is not completely efficient, contrary to the traditional
recipe used in semi-analytical models. Instead, the
satellite halo is assumed to retain some fraction of its hot
gas, which is determined by its orbit within the larger halo.
This gas can cool directly onto the satellite rather than the
central galaxy in the halo. Font2008a gives an improved
match to the proportions of red and blue galaxies seen in
SDSS groups (Weinmann et al. 2006a,b).



• The Baugh et al. (2005) model (hereafter
Baugh2005M). Baugh2005M matches the observed counts
and redshifts of sub-mm galaxies and the luminosity
function of Lyman-break galaxies, as well as observations
of the local galaxy population, such as the sizes of galaxy
discs (cf. Gonzalez et al. 2009) and cold gas mass fractions.
In this model, merger-triggered star-bursts make a similar
contribution to the star formation rate per unit volume
at high redshift to that from galactic discs. Star-bursts
are assumed to have a top-heavy stellar initial mass
function (IMF), which Baugh et al. argued is essential for a
hierarchical galaxy formation model to match the sub-mm
counts, whilst at the same time reproducing observations of
local galaxies. The formation of bright galaxies is regulated
by supernova-driven "super-winds", which expel gas from
intermediate mass dark matter haloes (see Benson et al.
2003).

In this paper we implement Baugh2005M in the Millennium
simulation. The cosmology used in the Millennium
simulation is somewhat different to that adopted in the
original Baugh et al. model (the former has a matter density
of $\Omega_{\rm M} = 0.25$, a dimensionless Hubble parameter of
$h = 0.73$ and a power spectrum normalisation of $\sigma_8 = 0.9$,
whereas the latter used $\Omega_{\rm M} = 0.3$, $h = 0.7$ and $\sigma_8 = 0.93$).
To reproduce the predictions of Baugh et al., we retain
the baryon fraction $\Omega_{\rm b}/\Omega_{\rm M}$ of the original model, setting
$\Omega_{\rm b} \approx 0.033$. The other galaxy formation parameters have
not been changed. This model is not available in the Millennium
Archive.

• The De Lucia & Blaizot (2007) model (hereafter DeLucia2006a).
DeLucia2006a employs AGN feedback in the
"radio-mode" to restrict the formation of bright galaxies at
the present day. This model is a development of those introduced
by Croton et al. (2006) and De Lucia et al. (2006).
It enjoys many of the same successes as Bower2006a, but,
if anything, produces too many stars at high redshift (cf.
Kitzbichler & White 2007).
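The rescaling of the baryon density used for Baugh2005M can be checked with a few lines of arithmetic. The sketch below uses only the density parameters quoted in the text; the function names are ours, and the "implied original" baryon density is an inference from those numbers, not a value taken from Baugh et al.

```python
def baryon_fraction(omega_b, omega_m):
    """Cosmic baryon fraction Omega_b / Omega_M."""
    return omega_b / omega_m


def omega_b_for_fraction(fraction, omega_m):
    """Baryon density parameter that yields a given baryon fraction."""
    return fraction * omega_m


# Baugh2005M as run on the Millennium cosmology (values quoted in the text).
f_baugh = baryon_fraction(0.033, 0.25)

# Default Millennium cosmology, used by the other three models.
f_millennium = baryon_fraction(0.045, 0.25)

# Inference: the baryon density that gives the same fraction when Omega_M = 0.3.
implied_original_omega_b = omega_b_for_fraction(f_baugh, 0.3)

print(f"Baugh2005M baryon fraction:  {f_baugh:.3f}")
print(f"Millennium baryon fraction:  {f_millennium:.3f}")
print(f"Implied original Omega_b:    {implied_original_omega_b:.4f}")
```

The comparison shows that keeping the original baryon fraction gives Baugh2005M a lower baryon fraction than the other three models, which use the default Millennium value of $\Omega_{\rm b}$.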

2.2 Halo Identification and Merger Trees 

All of the models use halo merger histories extracted from 
the Millennium simulation, derived from identical group catalogues
produced by the SubFind code of Springel et al.
(2001). SubFind identifies distinct groups of particles using
the "friends-of-friends" (FOF) algorithm (cf. Davis et al.
1985) and then resolves each FOF group into self-bound
overdensities. These self-bound overdensities correspond to
subhaloes; the most massive subhalo within a FOF group is 
identified with the host dark matter halo while the remain- 
ing lower mass subhaloes within the group correspond to its 
substructures. There are as many group catalogues as there 
are output times and they are linked across multiple output 
times to produce merger trees. 
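For readers unfamiliar with the friends-of-friends step, the following is a minimal sketch of the idea: particles closer than a chosen linking length are "friends", and a group is any set of particles connected through chains of friends. This is not the SubFind implementation; it is a brute-force O(N^2) illustration with no periodic boundaries and no unbinding step, and the function name and example coordinates are ours.

```python
import math


def friends_of_friends(positions, linking_length):
    """Return FOF groups as sorted lists of particle indices.

    positions is a sequence of coordinate tuples; two particles are linked
    if their separation is at most linking_length.
    """
    n = len(positions)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[max(root_i, root_j)] = min(root_i, root_j)

    # Link every pair of particles closer than the linking length.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= linking_length:
                union(i, j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy example: a chain of three particles, a pair, and an isolated particle.
particles = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
             (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (9.0, 9.0, 9.0)]
fof_groups = friends_of_friends(particles, linking_length=0.15)
print(fof_groups)
```

Particles 0 and 2 are further apart than the linking length but end up in the same group because both are linked to particle 1. The same chaining is what allows low-density bridges to join otherwise distinct haloes.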

Although the input group catalogues are identical, the 
Durham and Munich groups construct their merger trees in- 
dependently using distinct algorithms. This means that it is 
possible to identify the same haloes at a given output time 
in the Durham and Munich models but the detailed merging 
histories of these haloes may differ. This difference reflects 
in part differences in the working definition of a halo. The 
Munich models track the set of particles that correspond 
to the most massive subhalo within the FOF group across
output times (cf. Croton et al. 2006), whereas the Durham
models track the set of particles that correspond, in general,
to the FOF group (cf. Helly et al. 2003; Harker et al.
2006). We say "in general" because these FOF groups are modified
to avoid situations where the groups become prematurely
or temporarily linked by bridges of low-density material (cf.
Harker et al. 2006; Cole et al. 2008).

The difference also reflects how the host subhaloes of 
satellites are treated. In the Durham models, an infalling 
subhalo is considered a satellite galaxy of its more massive 
host halo once it loses in excess of 25% of the mass it had 
at the time of its accretion and it lies within twice its host 
halo's half-mass radius (cf. Harker et al. 2006). Importantly,
this subhalo is then treated as a satellite at all subsequent 
times, even if its orbit brings it outside of its host's virial 
radius at some later time. In contrast, the Munich models 
simply require that the infalling satellite lies within the virial 
radius of its host; if the subhalo's orbit takes it beyond the 
virial radius, it is no longer classed as a satellite (John Helly, 
private communication). Also, the Durham models require 
that the masses of subhaloes must increase monotonically 
with time, whereas the Munich models allow a subhalo's
mass to either increase or decrease with time. Finally, the
Durham models explicitly tag subhaloes that are satellites 
and treat them accordingly, whereas the Munich models do 
not treat subhaloes that host satellites any differently from 
subhaloes that do not. 
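The two satellite criteria described above can be contrasted in a schematic form. The sketch below encodes only the conditions stated in the text; the function and argument names are hypothetical, and neither function reproduces the actual merger-tree codes of either group.

```python
def durham_is_satellite(mass_now, mass_at_accretion, separation,
                        host_half_mass_radius, already_satellite=False):
    """Durham-style criterion, as described in the text.

    A subhalo becomes a satellite once it has lost more than 25% of its
    mass at accretion and lies within twice the host's half-mass radius.
    Once tagged, it remains a satellite at all later times.
    """
    if already_satellite:
        return True
    lost_fraction = 1.0 - mass_now / mass_at_accretion
    return lost_fraction > 0.25 and separation < 2.0 * host_half_mass_radius


def munich_is_satellite(separation, host_virial_radius):
    """Munich-style criterion: inside the host's virial radius right now."""
    return separation < host_virial_radius


# A subhalo that was tagged earlier and has since orbited beyond the host
# (arbitrary length units; the numbers are made up for illustration).
print(durham_is_satellite(0.7, 1.0, 3.0, 1.0, already_satellite=True))
print(munich_is_satellite(3.0, 2.5))
```

In this example the Durham-style rule still returns True while the Munich-style rule returns False, which is the difference in behaviour for subhaloes whose orbits carry them back outside the host.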



2.3 Galaxy Formation Physics 

We now highlight some of the areas in which there are either 
important differences in the implementation of the physics 
between the models or in which different processes have been 
adopted. For full descriptions of each model we refer the 
reader to the original references given above. A comparison 
of Bower2006a and Baugh2005M is given in Almeida et al.
(2007); the differences between Bower2006a and Font2008a
are set out in Font et al. (2008).

(1) Gas cooling: gas density and cooling radius. The 
models all assume that gas cools primarily (for the haloes 
which typically host galaxies) by two-body collisional pro- 
cesses involving neutral or ionised atoms. The cooling rate 
depends upon the composition (metallicity) and the density 
of the gas. Gas is assumed to have cooled within some cooling
radius, which is defined in different ways in the models.
A further timescale that regulates the addition of cold gas
into a galactic disc is the free-fall time. 

The models make different assumptions about the den- 
sity profile of the gas and the cooling radius: 

(i) DeLucia2006a assumes that the gas follows a singular
isothermal profile (see Croton et al. 2006), and the cooling
radius is defined as the radius at which the cooling time is 
equal to the dynamical time of the halo. 

(ii) Both Bower2006a and Font2008a assume that the hot 
gas density profile follows an isothermal profile with a con- 
stant density core, whose core radius is fixed and scales with 
the virial radius of the halo. The cooling radius propagates 
outwards as a function of time, reaching a maximum at the 
radius where the cooling time is equal to the lifetime of the
dark matter halo (see Cole et al. 2000). In Font2008a, the
cold gas yield is a factor of two higher than that adopted in
Bower2006a, which gives a better match to observed galaxy
colours (Gonzalez et al. 2009).

(iii) Baugh2005M assumes that the hot gas follows an 
isothermal profile with a constant density core, whose core 
radius evolves with time as low entropy gas cools (see Cole
et al. 2000). The cooling radius is defined in the same way
as in Bower2006a and Font2008a.
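As an illustration of how a cooling radius follows from a density profile, consider the singular isothermal case in (i). The sketch rests on our own simplifying assumptions: the gas has a single temperature and metallicity, so the cooling time scales inversely with density, and for a density falling as $r^{-2}$ the cooling time grows as $r^2$. The function names and the parameterisation by the cooling time at the virial radius are ours, not those of Croton et al. (2006).

```python
import math


def cooling_time_sis(r, t_cool_at_rvir, r_vir):
    """Cooling time at radius r for gas density proportional to r^-2.

    Assumes isothermal gas of uniform metallicity, so that the cooling
    time is inversely proportional to the local density.
    """
    return t_cool_at_rvir * (r / r_vir) ** 2


def cooling_radius_sis(t_cool_at_rvir, t_dyn, r_vir):
    """Radius at which the cooling time equals the halo dynamical time."""
    return r_vir * math.sqrt(t_dyn / t_cool_at_rvir)


# Made-up numbers: cooling time at the virial radius is 4 times the
# dynamical time, with a virial radius of 200 in arbitrary length units.
r_cool = cooling_radius_sis(t_cool_at_rvir=4.0, t_dyn=1.0, r_vir=200.0)
print(r_cool, cooling_time_sis(r_cool, 4.0, 200.0))
```

Under these assumptions the cooling radius scales as the square root of the ratio of the two timescales. When the cooling time at the virial radius is shorter than the dynamical time, the formal cooling radius exceeds the virial radius, a case this sketch does not treat separately.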

(2) Gas cooling: AGN heating of the hot halo. 
Bower2006a, Font2008a and DeLucia2006a all modify the 
cooling flow in massive haloes by appealing to heating from 
radio-mode AGN feedback, following the accretion of mate- 
rial from the cooling flow onto a central supermassive black 
hole. 

(3) Gas cooling: halo baryon fraction. In Baugh2005M 
there is no heating of the hot halo by AGN feedback. In- 
stead, a new channel is introduced for gas heated by the 
energy released by supernova explosions. Some fraction of 
the gas, as is common in the majority of semi-analytical 
models, is reheated and re-incorporated, on some timescale, 
into the hot gas halo (see Benson et al. 2003). The rest of 
the reheated gas is ejected from the halo altogether in the 
superwind. In Baugh2005M, this gas is not allowed
to re-cool at any stage. This process becomes inefficient in 
more massive haloes. However, the cooling rate is reduced in 
such haloes because they have less than the universal frac- 
tion of baryons (due to superwind ejection of gas from their 
progenitors). A detailed description of how superwinds are
modelled in Baugh2005M is given in Lacey et al. (2008),
who looked at the properties of galaxies in the infra-red for 
comparison with observational data from the Spitzer space 
telescope. 

(4) Gas cooling: cooling in satellites. Font2008a intro- 
duced a new cooling scenario based on the hydrodynamical 
simulations of McCarthy et al. (2008). Traditionally, the ram
pressure stripping of the hot gas from a satellite halo has 
been assumed to be maximally efficient and instantaneous 
following a merger between two dark matter haloes. Mc- 
Carthy et al. showed that in gas simulations this is not the 
case and that the satellite can retain a substantial amount 
of hot gas, with the fraction depending upon the satellite 
orbit. McCarthy et al. used a suite of simulations to cali- 
brate a recipe to describe how much hot gas is kept. Font 
et al. (2008) extended the GALFORM code to include this 
prescription to calculate the amount of hot gas attached to
each satellite galaxy within a halo and to allow the gas to
cool directly onto the satellite, rather than onto the central 
(most massive) galaxy in the main dark matter halo. The 
other models considered in this paper do not allow gas to 
cool onto satellite galaxies. 

(5) Galaxy mergers. Galaxies merge due to dynamical
friction. Baugh2005M adopts the form of the merger
timescale given by Eq. 4.16 of Cole et al. (2000). Bower2006a
and Font2008a use the same prescription with a timescale
that is longer by 50%; physically this can be explained as a
reduction in the mass of the satellite halo due to tidal stripping.
DeLucia2006a uses a hybrid scheme in which a satellite
galaxy is associated with a substructure halo, which is fol- 
lowed until stripping and disruption result in it dropping 
below the resolution limit of the simulation. From the last 
radius at which the substructure was seen, an analytic esti- 
mate of the merger time is made, using the dynamical fric- 
tion timescale (but with a different definition of the Coulomb 
logarithm) and applying a boost of a factor of 2 (to improve 
the match to the bright end of the present day optical lumi- 
nosity function). 

(6) Star formation. In the Durham models, the star formation
timescale scales with the circular velocity of the
disc computed at the half-mass radius. The star formation
timescale also depends on a timescale parameter which can 
be held fixed (Baugh2005M) or which can scale with the disc 
dynamical time (Bower2006a, Font2008a). All of the cold gas 
is available for star formation. In the Munich model (DeLu- 
cia2006a), a critical mass of cold gas has to be reached before 
star formation can begin. This is motivated by the observa- 
tional inference that star formation requires a critical surface 
density of cold gas (which also has a theoretical motivation;
cf. Kennicutt 1998). Only the cold gas mass in excess of
the threshold is available for star formation. The timescale 
adopted is the disc dynamical time. All the models adopt 
a standard solar neighbourhood initial mass function (IMF) 
for quiescent star formation in discs, although Baugh2005M 
adopts a top-heavy IMF with a correspondingly higher yield 
and recycled fraction for episodes of star formation triggered 
by galaxy mergers. 
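The practical difference between the two star formation prescriptions in (6) can be seen in a schematic form. The sketch keeps only the features stated in the text: in the threshold picture only gas above a critical mass forms stars on a dynamical time, whereas without a threshold all of the cold gas is available on some star formation timescale. The function names and the efficiency parameter are hypothetical, and the circular velocity scaling of the Durham timescale is omitted.

```python
def sfr_with_threshold(m_cold, m_crit, t_dyn, efficiency):
    """Threshold picture: only gas in excess of m_crit forms stars.

    efficiency is a hypothetical dimensionless parameter; t_dyn is the
    disc dynamical time.
    """
    return efficiency * max(0.0, m_cold - m_crit) / t_dyn


def sfr_without_threshold(m_cold, tau_star):
    """No-threshold picture: all cold gas forms stars on a timescale tau_star."""
    return m_cold / tau_star


# Made-up numbers in arbitrary units: a gas-poor and a gas-rich disc.
for m_cold in (1.0, 5.0):
    print(m_cold,
          sfr_with_threshold(m_cold, m_crit=2.0, t_dyn=1.5, efficiency=0.1),
          sfr_without_threshold(m_cold, tau_star=2.5))
```

In the threshold picture a disc whose cold gas mass lies below the critical value forms no stars at all and retains its gas, whereas without a threshold it continues to form stars at a reduced rate.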

(7) Heating of cooled gas by supernovae. In the Durham 
models, the amount of gas reheated by supernova feedback 
per timestep is a multiple of the star formation rate, which 
depends principally on the circular velocity of the disc 
as well as the choice of values adopted for the feedback 
parameters. As we have touched upon in (3) above, in 
Baugh2005M the gas reheated by supernovae can either be 
ejected completely in a superwind or heated up so that it 
is later re-incorporated into the hot halo (when a new halo 
forms i.e. after a doubling of the halo mass). Bower2006a 
and Font2008a do not consider the superwind channel 
for reheated gas. These models allow the reheated gas to 
be added to the hot gas reservoir on a timescale which 
depends on the halo dynamical time, rather than waiting 
for a new halo to form. DeLucia2006a follows Croton et al.
(2006), who globally pin the rate at which gas is reheated
by supernovae to a multiple of the star formation rate 
suggested by observations. The amount of energy released 
by supernovae is tracked and used to compute if any of the 
hot halo is ejected, to be re-incorporated on some timescale. 



A common feature of all the models presented in this
paper is that they contain parameters that are set by requiring
that the predictions reproduce a subset of the available
observational data. The primary consideration when
setting the model parameters is that the model reproduces 
the present-day optical luminosity function as closely as possible.
However, this alone is insufficient to set all of the parameters,
and so selected secondary observations are matched
in order to specify the model. For example, the observed 
gas fractions in spirals and the sizes of discs are used to 
determine the Baugh2005M parameters, while Bower2006a 
focuses on reproducing the bimodality of the colour distribution
of local galaxies (cf. Gonzalez et al. 2009). In both
cases, constraining the parameters in this way fixes the star 
formation timescales in the models. We refer the reader to 
the original references for a more complete discussion of 
which datasets are reproduced by the respective models. 

It is worth noting that, in the context of this study, gas 
fractions in spirals are the only data used to set parameters 
which explicitly relate to the cold gas content of galaxies (see footnote 3);
other observations, such as the galaxy luminosity and mass 
functions, provide indirect constraints on the cold gas con- 
tent. None of the model parameters have been adjusted for 
the purposes of this paper, except for the reduction in the 
cosmological baryon fraction in Baugh2005M, as explained 
above. 



3 BASIC PREDICTIONS 

In this Section we present the model predictions for the 
cold gas masses, radii and rotation speeds of galactic discs. 
These quantities are used in the next section to predict 
the 21-cm luminosity of the galaxies. Note that we do not
discuss any quantities derived from these direct model
outputs here, instead deferring such discussion until §4.

We begin by inspecting the cold gas mass functions predicted
by the four models in Fig. 1 at $z=0$ (upper panel)
and $z=1$ (lower panel). For comparison, we show also an
"observed" $z \simeq 0$ cold gas mass function (open circles and
error bars), obtained by converting the $z=0$ mass function
of HI in galaxies from HIPASS (cf. Zwaan et al. 2005) to a
cold gas mass function, using the "fixed H2/HI ratio" conversion
factor discussed in §4.1. The reader should note that
cold gas masses are plotted in units of $h^{-2}\,{\rm M}_\odot$ rather than
$h^{-1}\,{\rm M}_\odot$, which is the unit used in simulations. This ensures
that the observational units (which depend upon the square
of the luminosity distance) are matched, but it introduces an
explicit dependence on the dimensionless Hubble parameter
$h$; here we adopt $h=0.73$, the value used in the Millennium
simulation.
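The change of units amounts to a single multiplication by $h$. A mass quoted as $x\,h^{-1}\,{\rm M}_\odot$ corresponds to a physical mass of $x/h$, and the same physical mass written as $y\,h^{-2}\,{\rm M}_\odot$ requires $y = x\,h$. The sketch below spells this out; the function name is ours.

```python
import math


def to_h2_units(mass_in_h1_units, h=0.73):
    """Convert a mass in h^-1 Msun (simulation units) to h^-2 Msun.

    Both describe the same physical mass: x / h = y / h^2, so y = x * h.
    """
    return mass_in_h1_units * h


mass_sim = 1.0e9  # 10^9 h^-1 Msun
mass_plot = to_h2_units(mass_sim)
shift_dex = math.log10(mass_plot / mass_sim)
print(f"{mass_plot:.3e} h^-2 Msun, a shift of {shift_dex:.3f} dex")
```

For $h=0.73$ the shift is about $-0.14$ dex, so the conversion moves the model mass functions along the mass axis without changing their shape.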

We find that DeLucia2006a and Baugh2005M recover 
the observed $z \simeq 0$ cold gas mass function reasonably well, following
the data closely between $M_{\rm cold} \simeq 10^{8.5}\,h^{-1}\,{\rm M}_\odot$ (approximately
the cold gas mass resolution limit of the model;
see below) and $M_{\rm cold} \simeq 10^{9.8}\,h^{-1}\,{\rm M}_\odot$; at larger $M_{\rm cold}$,
both models tend to overestimate the amount of cold gas
in galaxies by $\sim 0.25$ dex. In contrast, both Font2008a and



A common feature of all the models presented in this 



3 Even then, these data were used in only a subset of the models.

Figure 1. The predicted cold gas mass function at $z = 0$ (top)
and $z = 1$ (bottom). The points show an observational estimate
of the HI mass function at $z=0$ by Zwaan et al. (2005), converted
into a cold gas mass function by adopting the fixed H2/HI
ratio conversion factor described in §4.1. Different line types
correspond to different models as indicated by the legend. For 
Baugh2005M, the results using the Millennium simulation merger 
trees are shown by the dashed black lines. The dotted and dot- 
dashed lines show calculations using Monte Carlo merger trees 
with improved mass resolution (with a mass resolution a factor 
of 2 and 4 better than the Millennium simulation respectively) 
but with the galaxy formation parameters held the same. The 
dotted vertical lines indicate the cold gas mass resolution limit 
of the Millennium galaxy formation models. The cold gas mass 
resolution limit is slightly higher at z = 1 than it is at z = 0. In 
the lower panel, the $z = 0$ data points are repeated for reference.



Bower2006a predict systematically more cold gas in galax- 
ies than is observed. This is unsurprising, however, because 
both Font2008a and Bower2006a also over-predict the gas-to-stellar
ratio in spirals (see the discussion in Cole et al.
2000 and their figure 9).

There is a minimum mass below which haloes are not 
reliably resolved in the Millennium simulation and this in 
turn imposes a minimum cold gas mass below which the pre- 
dictions of the galaxy formation models are unreliable. This 
arises because the simulation can only recover the abundance
of dark matter haloes down to some limiting mass;
below this limiting mass, the abundance of low-mass haloes 
will be suppressed because of finite mass resolution of the 
simulation. Furthermore, low-mass haloes may not be suffi- 
ciently well resolved for their merger trees to be considered 
reliable; this mass is likely to be larger than required for 
convergence of the halo mass function. 

This limiting halo mass is a problem because we expect 
cold gas to be present in haloes with masses below the 
resolution limit, and so we need to know how the limiting 
halo mass and the minimum cold gas mass relate to one 
another. This relationship can be estimated by running 
Monte Carlo merger trees of different minimum halo masses 
and comparing with the N-body merger trees. In practice,
we run the Baugh2005M model with higher resolution
trees generated using the new Monte Carlo prescription of
Parkinson et al. (2008) and determine the halo mass down
to which the Monte Carlo merger trees give a good match
to the trees extracted from the Millennium simulation.
The cold gas mass functions calculated using the N-body
trees and the Monte Carlo trees diverge below the mass
indicated by the dotted vertical line in Fig. 1, at a cold
gas mass of $M_{\rm cold} = 10^{8.5}\,h^{-1}\,M_\odot$. Note that we should
repeat this exercise for each model in principle because 
the resolution limit may be sensitive to the model recipes. 
However, given the close agreement between the predictions 
above this mass limit, we do not expect the variation in the 
cold gas mass resolution between models to be large and so 
we expect the limiting mass obtained with Baugh2005M to 
be a reasonable estimate for all the models. 

We note that there is little evolution in the predicted 
mass functions back to z = 1. This is remarkable because 
it shows that the sources and sinks of cold gas more or less 
balance one another out. How can we understand this? We 
expect the sizes of galactic discs to decrease with increasing 
redshift. In three of the models (Bower2006a, DeLucia2006a 
and Font2008a) star formation proceeds on a timescale that 
is proportional to the circular orbit timescale in the disc, and 
so it follows that the star formation timescale decreases with 
increasing redshift. However, gas cools from the hot halo on a 
timescale that depends on local gas density; because density 
increases with increasing redshift, it follows that the cooling 
timescale also decreases with increasing redshift. Therefore 
we might expect that the amount of gas to cool per unit time 
will increase with increasing redshift but this is offset by the 
increasing numbers of stars that form per unit time with 



4 Typically this limiting mass is equivalent to ~ 20 particles,
as required for the halo mass [...] (cf. [...] et al. 2001).
Figure 2. The predicted cold gas density $\rho_{\rm cold}$, normalised by
the value of the critical density $\rho_{\rm crit}$ at z = 0, as a function of
z. Different lines correspond to different models as indicated by
the legend. For Baugh2005M, the results using the Millennium
simulation merger trees are shown by the dashed black line. The
dotted and dot-dashed lines show calculations using Monte Carlo
merger trees with improved mass resolution but with the galaxy
formation parameters held the same (see caption of Fig. 1). Filled
and open circles correspond to Zwaan et al. (1997, 2005) data
respectively; open triangles correspond to Rao & Briggs (1993);
open and filled squares correspond to Péroux et al. (2003). In the
latter case, the open squares indicate the cold gas density inferred
from damped Lyman α systems by Péroux et al., and the filled
squares include a correction to take into account gas clouds with
lower column density.



increasing redshift. This leads competing sources (gas cool- 
ing) and sinks (star formation and mass ejection by winds) 
of cold gas to balance each other. 

This explanation is compelling in its simplicity, but 
it is far from clear that it provides the complete picture 
of what is happening. Competition between sources and 
sinks of cold gas can plausibly balance each other, but it 
is worth noting that there is little evidence for evolution 
of the cold gas mass function in Baugh2005M, which rests 
on assumptions that are quite different from those of 
Bower2006a, DeLucia2006a and Font2008a. In particular, 
the star formation timescale in Baugh2005M does not 
vary in proportion to the circular orbit timescale in the 
disc and so it is not obvious why the sources and sinks of 
cold gas should balance in this model, as they appear to. 
Understanding what physical processes drive the evolution 
of the cold gas mass function in the models is clearly an 
important issue, and one which shall be the focus of future 
work. 

We now consider the global mass density of cold gas,
$\Omega_{\rm cold} = \rho_{\rm cold}/\rho_{\rm crit}$. In Fig. 2 we show how $\rho_{\rm cold}$ varies
with redshift z, normalised by $\rho_{\rm crit}/h$ at z = 0 for ease of
comparison with observational data. This reveals that both



Bower2006a and Font2008a over-predict the density of cold 
gas at z < 1 and somewhat under-predict the amount of 
cold gas at higher redshifts. DeLucia2006a predicts a cold
gas density that is consistent with observational estimates
at z = 0 but it under-predicts the density at z > 0 by a
factor of two to three. Of all the models, Baugh2005M
most closely matches the observed density of cold gas at all
redshifts. Comparing the predictions for this model using
the Millennium simulation merger trees with those from
Monte Carlo merger trees (with improved mass resolution)
suggests that the N-body results are robust up to z ~ 4.



We now compare the rotation speeds of galactic discs as 
a function of cold gas mass. This is interesting to quantify 
because it indicates how the velocity width is likely to scale 
with HI mass, which is important for HI surveys. It also 
provides a useful insight into how the mean cold gas mass 
varies as a function of galaxy mass, which can be related to 
the rotation speed. 

Fig. 3 shows the rotation speed - cold gas mass relation
for galactic discs at z = 0 (top) and z = 1 (bottom)
predicted by the models. Note that the Durham and Mu- 
nich models define rotation speed in different ways; in the 
Durham models, the rotation speed plotted is the circular 
velocity at the half-mass radius of the disc, whereas in the 
Munich model (DeLucia2006a) , the rotation speed plotted is 
the circular velocity measured at the virial radius of the dark 
matter halo. The precise relationship between the circular 
velocities measured at the half-mass radius of the disc and 
at the virial radius of the host halo depends on the mass 
and distribution of the cold gas and stars in the disc and 
bulge and of the dark matter. For example, in Bower2006a, 
we find that the circular velocity at the half-mass radius of 
the disc is typically 20% higher than that measured at the 
virial radius of the host halo, for L* galaxies. After allow- 
ing for this difference, the DeLucia2006a rotation speed-cold 
gas mass relation is in close agreement with the predictions 
of Bower2006a and Font2008a. This level of agreement is 
quite remarkable given the differences in the implementa- 
tions of the physical ingredients in the models. In contrast, 
Baugh2005M predicts a rotation speed that is higher than 
the other models by around 50%. One possible explanation 
for this is that discs are more compact in this model, which 
is indeed the case in Fig. 4 (see below).

The model predictions diverge from each other below
a cold gas mass of $M_{\rm cold} = 10^{8.5}\,h^{-1}\,M_\odot$ and there is a
change in the slope of the rotation speed - cold gas mass
relation below this mass. This is the minimum mass down
to which the predictions from the Millennium simulation
merger trees are reliable. There is very little evolution in
the rotation speed - cold gas mass relation between z = 0
and z = 1; the zero-point of the z = 1 relation is about 25%
higher.



The predicted disc radius - cold gas mass relation is
plotted in Fig. 4. For the Durham models, the disc radius
plotted corresponds to the half-mass radius of the disc,
which is calculated by taking into account conservation of 
the angular momentum of the cooling gas and the dynami- 
Figure 3. The predicted circular velocity - cold gas mass relation
at z = 0 (top) and z = 1 (bottom). The points show the median
velocity and the bars show the 10th-90th percentile range. Different
symbols correspond to different models as indicated by the legend.
In DeLucia2006a, the velocity plotted is measured at the virial
radius of the dark matter halo; in the other cases, it is the circular
velocity at the half-mass radius of the disc. The dotted vertical
line indicates the cold gas mass resolution limit of the Millennium
galaxy formation models.

cal equilibrium of the disc, bulge and dark matter (see footnote 5). For the
Munich model (DeLucia2006a), the quantity stored in the 
Millennium Archive is three times the scale-length of the ex- 
ponential disc, which is computed by scaling from the virial 

5 See Cole et al. (2000), Almeida et al. (2007) and Gonzalez et al.
for details of this calculation.



Figure 4. The predicted disc radius - cold gas mass relation at
z = 0 (top) and z = 1 (bottom). The points show the median
radius and the bars show the 10th-90th percentile range. Different
symbols correspond to different models as indicated by the
legend. The radius plotted is the half-mass radius of the disc.
In DeLucia2006a, the quantity stored in the Millennium Archive is
three times the scale-length of the exponential disc, which we
convert to a half-mass radius. The dotted vertical line indicates the
cold gas mass resolution limit of the Millennium galaxy formation
models.



radius of the dark matter halo. We convert this length to a
half-mass disc radius to plot on Fig. 4.

The DeLucia2006a half-mass radii estimated in this way
are approximately 0.2-0.3 dex larger than those predicted
by Bower2006a and Font2008a. In contrast, Baugh2005M
predicts smaller gas discs than the other models, but we
note that this model also predicts sizes for stellar discs that
are in much better agreement with observational data at
z = 0 than the other models (Gonzalez et al. 2009). As with
the rotation speed - cold gas mass relation, there is little
evolution in the disc radius - cold gas mass relation between
z = 0 and z = 1. The model predictions diverge from one
another below the mass resolution of $M_{\rm cold} = 10^{8.5}\,h^{-1}\,M_\odot$.

It is surprising that the
DeLucia2006a, Bower2006a and Font2008a predictions are 
so close, given the differences in the way the disc sizes are 
computed in the models. One might expect the half-mass 
radii predicted by the Durham models to be smaller than 
those from the Munich model because the former take into 
account the gravitational contraction of the dark matter and 
the self-gravity of the disc and bulge, whereas the latter 
adopts a simple scaling of the virial radius of the host halo. 



4 IMPLICATIONS FOR HI SURVEYS 

The results presented so far encapsulate what current semi- 
analytical galaxy formation models can tell us about the 
distribution of cold gas masses of galaxies as a function of 
redshift. We can use this information to deduce the distri- 
bution of HI masses of galaxies, from which we can predict 
number counts of HI sources as a function of redshift. These 
predictions can then be compared with forthcoming HI sur- 
veys on the SKA pathfinders, such as ASKAP, MeerKAT 
and APERTIF, and ultimately on the SKA. 

Such a comparison represents an important and fundamental
test of the semi-analytical galaxy formation framework.
Semi-analytical model parameters are calibrated explicitly
to reproduce statistical properties of the observed
galaxy population where observational data exist, such as
the galaxy luminosity function (e.g. Benson et al. 2003)
and the abundance of sub-mm galaxies at high redshift
(e.g. Baugh et al. 2005). However, this approach is sometimes
criticised precisely because it is calibrated to reproduce
properties of the observed galaxy population. It is not
always clear how robust model predictions are if the parameters
have been adjusted to match as many observational
datasets as possible. Few observed data exist for the HI
properties of galaxies at redshifts z > 0.05 and so such data will
provide a compelling test of the currently favoured models
we consider in this paper.

In this section we use the cold gas mass functions pre- 
sented in the previous section to predict HI source number 
counts for forthcoming HI surveys. To do this, we must first 
convert cold gas masses, which are the natural outputs of the 
models, to HI masses. Then we consider how the sensitivity 
and angular resolution of a radio telescope affect whether
or not a particular galaxy at a given redshift is likely to be
detected in a given HI survey. Finally we investigate the impact
of sensitivity on the number counts of HI galaxies in "peak 
flux limited" surveys and we assess the angular resolution 
required to resolve gas-rich galaxies out to z ~ 3. 



4.1 Conversion of Cold Gas Mass to HI Mass 

The results presented in §3 are for cold gas masses in galaxies,
but we require HI masses. How should we convert from
cold gas to HI mass? 



• First, we note that ~ 24% by mass of this cold gas will 
be in the form of helium; this leaves ~ 76% by mass in the 
form of hydrogen. 

• Second, we note that this ~ 76% hydrogen will be split 
into neutral (atomic, molecular) and ionised fractions, but 
for simplicity we assume that the ionised fraction in the disc 
is sufficiently small that we can ignore it. 

• Third, we must determine what fraction of the neutral 
hydrogen is molecular in form; this then allows us to assign 
an HI mass to each galaxy, given its cold gas mass. 

It is worth providing some justification for our argument
that the ionised fraction is small. Recall that we consider
cold gas to be gas that has cooled radiatively from a
hot phase to below $10^4$ K and is available for star formation
(cf. §2). At a temperature of $\lesssim 10^4$ K, this cold gas will
include warm hydrogen in its atomic and ionised phases (cf.
Ferrière 2001). Observations tell us that the ratio of ionised
to atomic hydrogen in the midplane of the Galaxy is small
(~ 5%) but it increases with increasing scale-height, and by
scale-heights > 1 kpc the warm ionised state probably dominates
(cf. Reynolds 2004). The typical (i.e. full width at half
maximum) scale-heights of atomic and molecular hydrogen
are much smaller than this (~ 100-200 pc), and so what
one estimates for the ionised mass within the disc depends
on the range of scale-heights included. We adopt the mass
estimates of Ferrière (2001) for the total neutral and ionised
hydrogen masses of the Galaxy ($\gtrsim 7.5 \times 10^9\,M_\odot$ and
$> 1.6 \times 10^9\,M_\odot$ respectively) to estimate that the ionised
fraction constitutes approximately 15% by mass of the Galactic disc. This
is sufficiently small that we can ignore it for the purposes of
this study, although more detailed modelling would need to
take it into account.

Of the remaining ~ 76% by mass of cold gas that is in 
the form of neutral hydrogen, what is the ratio of molecular 
(H2) to atomic (HI) hydrogen? We consider two approaches:

• A "fixed H2/HI" ratio for all galaxies for all redshifts
(cf. Baugh et al. 2004).

• A "variable H2/HI" ratio that depends on galaxy
properties and redshift (cf. Obreschkow & Rawlings 2009;
Obreschkow et al. 2009b).



The fixed H2/HI approach was used in the Baugh et al.
(2004) study and it allows us to apply a simple uniform
scaling to the cold gas mass functions presented in §3 to
obtain HI mass functions. It is a purely empirical scaling in
the sense that it uses estimates of the global H2 and HI
densities in the local Universe to deduce the ratio of molecular
to atomic hydrogen. Baugh et al. (2004) used the estimates
of Keres et al. (2003) and Zwaan et al. (2005) respectively
for the global H2 and HI densities
($\rho_{\rm H_2} = (3.1 \pm 0.9) \times 10^7\,h\,M_\odot\,{\rm Mpc}^{-3}$ and
$\rho_{\rm HI} = (8.1 \pm 1.3) \times 10^7\,h\,M_\odot\,{\rm Mpc}^{-3}$) to
deduce a ratio of molecular to atomic hydrogen of ~ 0.4.
This gives a conversion factor of



$$M_{\rm HI} = 0.76\,M_{\rm cold}/(1 + 0.4) \simeq 0.54\,M_{\rm cold} \qquad (1)$$



which is the one we adopt. 
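
As a minimal sketch of the fixed H2/HI conversion in Eq. 1, the short Python fragment below applies the 76% hydrogen fraction and the global H2/HI ratio of ~ 0.4 quoted above. The function and variable names are illustrative choices made here, not part of the models.

```python
HYDROGEN_FRACTION = 0.76   # hydrogen share of cold gas by mass (from the text)
FIXED_H2_TO_HI = 0.4       # global H2/HI ratio adopted in the text


def hi_mass_fixed_ratio(m_cold, h2_to_hi=FIXED_H2_TO_HI):
    """Eq. 1: HI mass for a given cold gas mass (same units as m_cold)."""
    return HYDROGEN_FRACTION * m_cold / (1.0 + h2_to_hi)


# Ratio implied by the quoted global densities; the common units cancel.
rho_h2 = 3.1e7
rho_hi = 8.1e7
implied_ratio = rho_h2 / rho_hi

print(hi_mass_fixed_ratio(1.0), implied_ratio)
```

The quoted densities give a ratio slightly below 0.4, so the adopted value is a rounded one; the conversion factor changes only at the per cent level between the two.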

The variable H2/HI approach is based on the work
of Blitz & Rosolowsky (2006), Leroy et al. (2008) and
Obreschkow et al. (2009a), and it allows us to estimate the
H2/HI ratio on a galaxy-by-galaxy basis. There have been
various attempts to predict theoretically the variation of the
H2/HI ratio on galactic scales based on physical models of
the ISM; for example, Elmegreen (1993) argued that the
most important physical parameters driving variations in
the H2/HI ratio are the gas pressure and the local intensity
of the interstellar UV radiation field, while Krumholz et al.
(2009) argued instead that the main physical parameters
driving variations are the column density and metallicity of
interstellar gas clouds. On the other hand, Wong & Blitz
(2002) found from spatially resolved observations of nearby
galaxies that the ratio of H2 to HI surface densities increases
with increasing midplane hydrostatic gas pressure, following
a power-law relation $\Sigma_{\rm H_2}/\Sigma_{\rm HI} \propto P_0^{\alpha}$. This empirical
power-law relation was then confirmed in more detailed
observational studies by Blitz & Rosolowsky (2006) and Leroy
et al. (2008), and found to extend from $\Sigma_{\rm H_2}/\Sigma_{\rm HI} \ll 1$ to
$\Sigma_{\rm H_2}/\Sigma_{\rm HI} \gg 1$. Note that the midplane pressure $P_0$ used
in these relations is not directly measured, but is instead
inferred from the gas and stellar surface densities combined
with assumed velocity dispersions or scale-heights, assuming
hydrostatic equilibrium.

Building on this work, Obreschkow et al. (2009a) have
derived a model for the global H2/HI ratio in a galaxy. This
uses the $\Sigma_{\rm H_2}/\Sigma_{\rm HI} = (P_0/P_*)^{\alpha}$ relation in the form found by
Leroy et al. (2008), with $\alpha = 0.8$ and $P_* = 2.35 \times 10^{-13}$ Pa,
and assumes that the stars and gas in a galactic disc both
have exponential profiles, though with different radial
scale-lengths. After setting the free parameters of the model
by comparison with observational data on nearby galaxies,
Obreschkow et al. obtain the following expression for the
global H2/HI ratio, $R_{\rm mol}^{\rm gal} \equiv M_{\rm H_2}/M_{\rm HI}$:

$$R_{\rm mol}^{\rm gal} = \left(3.44\,{R_{\rm mol}^{\rm c}}^{-0.506} + 4.82\,{R_{\rm mol}^{\rm c}}^{-1.054}\right)^{-1}, \qquad (2)$$

where $R_{\rm mol}^{\rm c}$ is the H2/HI ratio at the centre of the disc, given
by

$$R_{\rm mol}^{\rm c} = \left[K\,r_{\rm disc}^{-4}\,M_{\rm gas}\left(M_{\rm gas} + \langle f_\sigma\rangle\,M_{\rm disc}^{*}\right)\right]^{0.8}. \qquad (3)$$

In the above expressions, $M_{\rm disc}^{*}$ and $M_{\rm gas}$ are the masses of
stars and gas in the disc, and $r_{\rm disc}$ is the exponential
scale-length of the gas, while $K = G/(8\pi P_*) = 11.3\ {\rm m^4\,kg^{-2}}$,
and $\langle f_\sigma\rangle = 0.4$ is the average ratio of the vertical velocity
dispersions of gas to stars. Given $R_{\rm mol}^{\rm gal}$, the conversion factor
between cold gas mass and HI mass is then

$$M_{\rm HI} = 0.76\,M_{\rm cold}/(1 + R_{\rm mol}^{\rm gal}). \qquad (4)$$

We employ Eq. 2, Eq. 3 and Eq. 4 using the values of $M_{\rm disc}^{*}$
and $M_{\rm gas}$ predicted by the semi-analytical models. Because
neither the Durham nor Munich models distinguish between
the half-mass radii of the stars and gas, we take the gas
half-mass radius to be equal to the total half-mass radius
predicted by the semi-analytical model, and we then convert
it to a disc scale-length $r_{\rm disc}$ by assuming an exponential disc
(as in Obreschkow et al.).
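
The chain from Eq. 3 through Eq. 2 to Eq. 4 can be sketched in Python as below. This is an illustration under stated assumptions rather than the code used for the models: the function names are invented here, the exponents in Eq. 2 (0.506 and 1.054) are as read from the expression above and are worth checking against Obreschkow et al., the cold gas mass is used directly as $M_{\rm gas}$, and the half-mass radius of an exponential disc is taken to be 1.678 scale-lengths, which is the standard result for that profile.

```python
M_SUN = 1.989e30            # solar mass [kg]
KPC = 3.0857e19             # kiloparsec [m]
K_CONST = 11.3              # K = G / (8 pi P_*) [m^4 kg^-2], from the text
F_SIGMA = 0.4               # mean gas-to-star velocity dispersion ratio, from the text
HYDROGEN_FRACTION = 0.76    # hydrogen mass fraction of cold gas, from the text
HALF_MASS_TO_SCALE = 1.678  # r_half / r_disc for an exponential disc (assumed)


def central_ratio(m_gas_msun, m_star_disc_msun, r_disc_kpc):
    """Eq. 3: H2/HI ratio at the centre of the disc (SI units internally)."""
    m_gas = m_gas_msun * M_SUN
    m_star = m_star_disc_msun * M_SUN
    r_disc = r_disc_kpc * KPC
    return (K_CONST * r_disc ** -4 * m_gas * (m_gas + F_SIGMA * m_star)) ** 0.8


def global_ratio(r_c):
    """Eq. 2: galaxy-wide H2/HI ratio from the central ratio."""
    return 1.0 / (3.44 * r_c ** -0.506 + 4.82 * r_c ** -1.054)


def hi_mass_variable_ratio(m_cold_msun, m_star_disc_msun, r_half_kpc):
    """Eq. 4: HI mass using the variable H2/HI ratio (assumes M_gas = M_cold)."""
    r_disc_kpc = r_half_kpc / HALF_MASS_TO_SCALE
    r_gal = global_ratio(central_ratio(m_cold_msun, m_star_disc_msun, r_disc_kpc))
    return HYDROGEN_FRACTION * m_cold_msun / (1.0 + r_gal)


# Hypothetical galaxy: 3e9 Msun of cold gas, 3e10 Msun of disc stars.
for r_half in (2.0, 5.0, 10.0):
    print(r_half, hi_mass_variable_ratio(3.0e9, 3.0e10, r_half))
```

Halving the scale-length at fixed masses raises the central ratio by a factor of $2^{3.2} \approx 9$, which is why the predicted HI mass is so sensitive to the disc sizes in each model.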

How significant is the difference between the HI masses
estimated assuming a variable H2/HI approach and those
estimated assuming a fixed H2/HI approach? Fig. 5 shows the
distribution of the ratio $M_{\rm H_2}/M_{\rm HI}$ (i.e. $R_{\rm mol}^{\rm gal}$) for galaxies at
z = 0 in Bower2006a. This is interesting because $R_{\rm mol}^{\rm gal}$ fixes
$M_{\rm HI}$ for a galaxy, given $M_{\rm cold}$ (cf. Eq. 4), and so knowledge
of the distribution of $R_{\rm mol}^{\rm gal}$ provides important information
about the distribution of $M_{\rm HI}$. We split our galaxy sample
by $M_{\rm cold}$ into three mass bins, of width 0.3 dex, centred



Figure 5. Distribution of the ratio $M_{\rm H_2}/M_{\rm HI}$ ($R_{\rm mol}^{\rm gal}$) predicted
by the variable H2/HI approach for galaxies with cold gas masses
$M_{\rm cold} = 10^{9}/10^{9.5}/10^{10}\,M_\odot$ (solid, dotted and dashed histograms
respectively) in Bower2006a at z = 0. We include all galaxies
within a bin of 0.3 dex centred on $M_{\rm cold}$. The histograms are
normalised by the area under the curve. The light dotted vertical
line indicates the ratio assumed if the fixed H2/HI approach is
used.

on $M_{\rm cold} = 10^{9}$, $10^{9.5}$ and $10^{10}\,M_\odot$ (solid, dotted and dashed
histograms); these contain 55042, 60579 and 32213 galaxies
respectively. For each mass bin, we construct a histogram
of $M_{\rm H_2}/M_{\rm HI}$ estimated using Eq. 2 and we normalise it by
the area under the curve. For comparison, we indicate also
the ratio $M_{\rm H_2}/M_{\rm HI}$ one obtains assuming a fixed H2/HI
approach by the light dotted vertical line.

Fig. 5 is striking because it shows that the variable H2/HI
approach predicts a broad distribution of $M_{\rm H_2}/M_{\rm HI}$ in
each mass bin. The medians of the distributions are ~ 0.9,
~ 1.5 and ~ 2.2 in the bins centred on
$M_{\rm cold} = 10^{9}$, $10^{9.5}$ and $10^{10}\,M_\odot$, compared to ~ 0.4 assumed in the
fixed H2/HI approach; indeed, only ~ 11% of the galaxies
in the $M_{\rm cold} = 10^{9}$ and $10^{9.5}\,M_\odot$ mass bins and ~ 5% in the
$M_{\rm cold} = 10^{10}\,M_\odot$ mass bin have $M_{\rm H_2}/M_{\rm HI}$ ratios as small as
this. Physically, this means that a typical galaxy will have
a smaller fraction of its cold gas mass in the form of HI -
by as much as a factor of ~ 100 - in the variable H2/HI
approach than in the fixed H2/HI approach. Inspection of
Eq. 3 suggests that this behaviour reflects the strong scaling
with $r_{\rm disc}$ ($\propto r_{\rm disc}^{-3.2}$). Within a given mass bin there is
a distribution of $r_{\rm disc}$ and the 10th and 90th percentiles of
this distribution differ by a factor of at least a few with
respect to the median, translating into variations of factors
of ~ 20-100 in surface density and consequently local gas
pressure with respect to the median. This implies that the
$M_{\rm H_2}/M_{\rm HI}$ ratio can in principle vary by factors of ~ $10^{2}$ to
$10^{4}$ within a given mass bin, which will affect the number of
HI sources that can be detected.

Figure 6. Variation of the median $M_{\rm H_2}/M_{\rm HI}$ (i.e. $R_{\rm mol}^{\rm gal}$) of all
galaxies with $M_{\rm cold} > 10^{8.5}\,M_\odot$ with redshift. Different lines
correspond to different models as indicated by the legend.

The dependence on $r_{\rm disc}$ in the variable H2/HI approach
implies that the mean/median $M_{\rm H_2}/M_{\rm HI}$ ratio should
increase sharply with increasing redshift, which in turn
implies that HI masses of galaxies should be smaller at high
redshifts. This is made clear in Fig. 6, which shows how
the median of the distribution of $M_{\rm H_2}/M_{\rm HI}$ for all galaxies
with $M_{\rm cold} > 10^{8.5}\,M_\odot$ varies with redshift for the four
models. In all cases, the median increases with redshift, as we
would expect. However, the most striking aspect of this
figure is the pronounced offsets between different models. The
DeLucia2006a medians are an order of magnitude smaller
than the Bower2006a/Font2008a medians at all redshifts.
In contrast, Baugh2005M predicts a median that is a
factor of a few larger at z = 0 than the Bower2006a/Font2008a
medians, but the difference grows to a factor of ~ 10 by
z ~ 3. The models predict broadly similar cold gas and
stellar masses and so it is the differences in scale-lengths,
apparent in Fig. 4, and the strong scaling with disc
scale-length ($\propto r_{\rm disc}^{-3.2}$) that drive these large offsets. The net effect
of this strong variation of $M_{\rm H_2}/M_{\rm HI}$ with redshift will be to
dramatically reduce the number of HI sources detectable at
higher redshift (see Fig. 8 in the next section).

This demonstrates that how one chooses to calculate a 
galaxy's HI mass is important. However, for the purposes of 
this study, we adopt the fixed H2 /HI approach to converting 
cold gas masses to HI masses. The variable H2/HI approach 
has been calibrated using observations of galaxies in the lo- 
cal Universe. There are sound physical reasons to expect 
that there will be a correlation between local gas pressure 
and molecular fraction in galactic discs at all redshifts, but 
it is unclear how reliable the local correlation is likely to be 
when applied to high redshifts. In contrast, the fixed H2/HI 
approach provides a reasonable upper bound to the number 
of sources we might expect to detect. 



4.2 Detection of HI Sources 

Two issues are key in determining whether or not an HI
source will be detected reliably by a radio telescope or
interferometer; namely, sensitivity and angular resolution. An
HI source has an intrinsic 21-cm luminosity that depends
primarily on its HI mass $M_{\rm HI}$ which, along with its
line-of-sight velocity width $\Delta V_{\rm los}$ and distance D, determines the
flux at the position of the observer. The observer measures
this flux with a receiver that has finite sensitivity determined
primarily by its effective collecting area $A_{\rm eff}$ and the system
temperature $T_{\rm sys}$, and it is this limiting sensitivity that
determines whether or not the source is detected. Note,
however, that angular resolution also plays an important role;
HI 21-cm emission from a galaxy is likely to be spatially
extended and an extended source can be "resolved out" by an
interferometer if it is observed with too high an angular
resolution. In the next two sections we consider how sensitivity
and angular resolution affect HI source number counts.



4.2.1 Sensitivity

If we could construct the ideal radio telescope with
arbitrarily high sensitivity, then we would observe a flux $S_{\rm obs}$
from an HI source at redshift z. This is determined by the
source's HI mass $M_{\rm HI}$, its velocity width $\Delta V_{\rm los}$ and redshift
z. The relationship between $S_{\rm obs}$ and $M_{\rm HI}$, $\Delta V_{\rm los}$ and z can
be obtained as follows.

The emissivity $\epsilon_{\nu_0}$ at rest-frame frequency $\nu_0$ tells us
the rate at which energy is emitted by an HI source at this
frequency per unit volume per steradian. We can express
this as

$$\epsilon_{\nu_0} = \frac{1}{4\pi}\,h\nu_0\,A_{12}\,\frac{n_2}{n_{\rm H}}\,n_{\rm H}\,\phi(\nu_0), \qquad (5)$$

where $h\nu_0$ is the energy of the 21-cm photon (h is Planck's
constant and $\nu_0$ is the photon frequency), $n_2/n_{\rm H}$ tells us
what fraction of atoms are expected to be in the upper state,
$A_{12}$ is the Einstein coefficient which tells us the spontaneous
rate of the transition from the upper to lower state, $n_{\rm H}$ is
the total number density of hydrogen atoms in the source
and $\phi(\nu)$ is the line profile. We expect $n_2/n_{\rm H} = 3/4$
because the temperature of the cloud corresponds to a much
larger energy than the energy difference corresponding to the
transition from the upper to lower state (i.e. $kT \gg h\nu$, cf.
Spitzer 1978). Integrating over a solid angle of $4\pi$ steradians
and over the volume of the source gives us the luminosity at
frequency $\nu_0$, $L_{\nu_0}$,

$$L_{\nu_0} = \frac{3}{4}\,h\nu_0\,A_{12}\,\frac{M_{\rm HI}}{m_{\rm H}}\,\phi(\nu_0), \qquad (6)$$

where we write the number of hydrogen atoms as $M_{\rm HI}/m_{\rm H}$,
$m_{\rm H}$ being the mass of the hydrogen atom.

When we observe 21-cm emission from an HI source,
the radiation arises from a forbidden transition, which
implies a small natural line width ($5 \times 10^{-16}$ Hz). Therefore
the observed line profile $\phi(\nu_0)$ is in practice determined
by Doppler broadening due to the motions of HI atoms
in the galaxy, which, in disc galaxies, are dominated by
the large-scale rotational velocity. We therefore assume that
$\phi(\nu_0)$ can be approximated as a top hat function of width
$\Delta\nu_0 = (\Delta V_{\rm los}/c)\,\nu_0$ and height $1/\Delta\nu_0$. Noting this, we can
write the total monochromatic flux at the position of the
observer as

$$S_{\nu} = (1 + z)\,\frac{L_{\nu_0}}{4\pi D_L(z)^2}, \qquad (7)$$
where $\nu = \nu_0\,(1 + z)^{-1}$ is the redshifted frequency measured
by the observer and $D_L(z) = (1 + z)\,D_{\rm co}(z)$ is the luminosity
distance of the source with respect to the observer ($D_{\rm co}(z)$
is the radial comoving separation between source and
observer). Therefore the measured flux at the position of the
observer is

$$S_{\rm obs}\,\Delta\nu = \frac{3}{16\pi}\,\frac{h\nu_0 A_{12}}{m_{\rm H}}\,M_{\rm HI}\,\frac{1}{D_L(z)^2}, \qquad (8)$$

which we rewrite as

$$S_{\rm obs} = \frac{3}{16\pi}\,\frac{h c A_{12}}{m_{\rm H}}\,M_{\rm HI}\,\frac{(1 + z)}{D_L(z)^2\,\Delta V_{\rm los}}. \qquad (9)$$

Here we assume that $\Delta\nu$ in Eq. 8 can be written as

$$\Delta\nu = \frac{\Delta\nu_0}{1 + z} = \frac{\Delta V_{\rm los}}{c}\,\frac{\nu_0}{1 + z}, \qquad (10)$$

where $\Delta V_{\rm los}$ is the rest-frame line-of-sight velocity width
of the galaxy. $\Delta V_{\rm los}$ will depend on the inclination i of
the disc, varying as $2 V_c \sin i$ where $V_c$ is the disc circular
velocity, and i = 0 or $\pi/2$ correspond to a face-on or
edge-on disc respectively. In our analysis we assume that
galaxy discs have random inclinations with respect to the
observer, with an average velocity width of ~ $1.57\,V_c$.
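
The flux relation can be illustrated with a short Python sketch. It assumes that Eq. 8 is evaluated with the rest-frame frequency $\nu_0$ and divided by the observed width from Eq. 10, takes the luminosity distance as an input rather than computing it from a cosmology, and adopts standard literature values for $A_{12}$ and $\nu_0$, which the text does not quote. The last function checks numerically that randomly inclined discs (with $\cos i$ uniformly distributed) have a mean width of $(\pi/2)\,V_c \approx 1.57\,V_c$. All names are illustrative.

```python
import math

# Physical constants (SI). A12 and NU0 are standard literature values,
# assumed here because the text does not quote them.
H_PLANCK = 6.626e-34      # Planck constant [J s]
C_LIGHT = 2.998e8         # speed of light [m/s]
M_HYDROGEN = 1.6735e-27   # mass of a hydrogen atom [kg]
A12 = 2.869e-15           # Einstein coefficient of the 21-cm line [1/s]
NU0 = 1.4204e9            # rest frequency of the 21-cm line [Hz]
M_SUN = 1.989e30          # solar mass [kg]
MPC = 3.0857e22           # megaparsec [m]
JY = 1.0e-26              # jansky [W m^-2 Hz^-1]


def observed_width_hz(dv_los_kms, z):
    """Eq. 10: observed frequency width for a rest-frame velocity width."""
    return (dv_los_kms * 1.0e3 / C_LIGHT) * NU0 / (1.0 + z)


def observed_flux_jy(m_hi_msun, d_lum_mpc, dv_los_kms, z):
    """Eq. 8 divided by Eq. 10: mean flux density across the line, in Jy."""
    n_atoms = m_hi_msun * M_SUN / M_HYDROGEN
    integrated = (3.0 / (16.0 * math.pi)) * H_PLANCK * NU0 * A12 * n_atoms
    integrated /= (d_lum_mpc * MPC) ** 2
    return integrated / observed_width_hz(dv_los_kms, z) / JY


def mean_width_factor(n_steps=100000):
    """Average of 2 sin(i) for random orientations (cos i uniform on [0, 1])."""
    total = 0.0
    for k in range(n_steps):
        cos_i = (k + 0.5) / n_steps
        total += 2.0 * math.sqrt(1.0 - cos_i * cos_i)
    return total / n_steps


# Hypothetical source: 1e9 Msun of HI, 200 km/s wide, at D_L = 100 Mpc.
print(observed_flux_jy(1.0e9, 100.0, 200.0, 0.0), mean_width_factor())
```

At negligible redshift this reproduces the familiar scaling in which $10^9\,M_\odot$ of HI at 100 Mpc with a 200 km/s line width gives a flux density of about 2 mJy.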



The measured flux from the source must be compared
with the intrinsic limiting sensitivity of the receiver.
Assuming a dual polarisation radio receiver, the limiting root mean
square flux $S_{\rm rms}$ can be calculated in a straightforward
manner (cf. Burke & Graham-Smith 1996):

$$S_{\rm rms} = \frac{2 k_B T_{\rm sys}}{A_{\rm eff}\,\sqrt{2\,\Delta\nu_{\rm rec}\,\tau}}, \qquad (11)$$

where $A_{\rm eff}$ is the total (effective) collecting area of the
telescope, $T_{\rm sys}$ is the system temperature, $\Delta\nu_{\rm rec}$ is the
bandwidth used in the receiver, $\tau$ is the integration time and $k_B$
is Boltzmann's constant. The effective area $A_{\rm eff}$ and system
temperature $T_{\rm sys}$ are the key parameters. The SKA will have
an effective area (see footnote 6) of order $A_{\rm eff} = 1\ {\rm km}^2$ and its pathfinders
will have effective areas of a percentage of this; for
pathfinders such as ASKAP, MeerKAT and APERTIF, this
percentage will be < 1%. A conservative estimate of the system
temperature would be $T_{\rm sys} = 50$ K. We can rewrite Eq. 11
as

$$\frac{S_{\rm rms}}{1.626\,\mu{\rm Jy}} = \left(\frac{A_{\rm eff}}{{\rm km}^2}\right)^{-1} \left(\frac{T_{\rm sys}}{50\,{\rm K}}\right) \left(\frac{\Delta\nu_{\rm rec}}{{\rm MHz}}\right)^{-1/2} \left(\frac{\tau}{{\rm hr}}\right)^{-1/2}, \qquad (12)$$

where $1\,{\rm Jy} = 10^{-26}\ {\rm W\,m^{-2}\,Hz^{-1}}$. The limiting flux sensitivity
of the telescope $S_{\rm lim}$ is then $S_{\rm lim} = n_\sigma S_{\rm rms}$, where $n_\sigma$ defines
the threshold for a galaxy to be reliably detected. Once we
have fixed the integration time $\tau$ and the bandwidth $\Delta\nu_{\rm rec}$,
we have the limiting sensitivity of our radio telescope.
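
As a quick numerical check of the sensitivity relation, the Python sketch below evaluates Eq. 11 in SI units and confirms that the fiducial values ($A_{\rm eff} = 1\ {\rm km}^2$, $T_{\rm sys} = 50$ K, a 1 MHz band and one hour of integration) give the 1.626 μJy normalisation of Eq. 12. The function names and the example detection threshold are illustrative assumptions, not values taken from the text.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant [J/K]
JY = 1.0e-26         # jansky [W m^-2 Hz^-1]


def s_rms_jy(a_eff_m2, t_sys_k, dnu_rec_hz, tau_s):
    """Eq. 11: rms flux density of a dual-polarisation receiver, in Jy."""
    noise = 2.0 * K_B * t_sys_k / (a_eff_m2 * math.sqrt(2.0 * dnu_rec_hz * tau_s))
    return noise / JY


def s_lim_jy(n_sigma, a_eff_m2, t_sys_k, dnu_rec_hz, tau_s):
    """Limiting flux density for an n_sigma detection threshold, in Jy."""
    return n_sigma * s_rms_jy(a_eff_m2, t_sys_k, dnu_rec_hz, tau_s)


# Fiducial case of Eq. 12: 1 km^2, 50 K, 1 MHz, 1 hour.
fiducial = s_rms_jy(1.0e6, 50.0, 1.0e6, 3600.0)
print(fiducial * 1.0e6, "microJy")

# A hypothetical 1% SKA with a 12 hour integration and a 5 sigma threshold.
print(s_lim_jy(5.0, 1.0e4, 50.0, 1.0e6, 12.0 * 3600.0) * 1.0e6, "microJy")
```

Because the noise scales as $A_{\rm eff}^{-1}$ but only as $\tau^{-1/2}$, a telescope with 1% of the collecting area needs $10^4$ times the integration time to reach the same depth.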

It is worth remarking on the relationship between the
frequency bandwidth $\Delta\nu_{\rm rec}$ in Eq. 11 and Eq. 12, which is
particular to the radio telescope, and the frequency width
$\Delta\nu$ in Eq. 8 and Eq. 9, which is set by the velocity width
of the HI line $\Delta V_{\rm los}$. If we are to maximise the
signal-to-noise ratio for detecting the galaxy in a survey, then it is


6 Despite its name, it is unlikely that the SKA will have an area
of 1 km$^2$; instead it is likely to be ~ 0.5 km$^2$, which ensures greater
survey speed at the expense of sensitivity.



important that these bandwidths are matched. The overall
telescope frequency bandwidth at a given frequency will be
broad - typically > 100 MHz - and much greater than the
velocity width of an individual galaxy (e.g. ~ 1 MHz
corresponds to a ~ 200 km s$^{-1}$ galaxy), but the overall bandwidth
consists of ~ $10^{3}$ to $10^{4}$ frequency channels that are much
narrower than the expected frequency width of a galaxy.
Therefore a single telescope pointing will produce a huge
data cube centred on a frequency $\nu$ with an overall
bandwidth that consists of thousands of narrower frequency
channels $\delta\nu$. These channels will then be re-binned to produce
data cubes with different frequency resolutions $\Delta\nu_{\rm rec}$, and
one of these re-binnings will have $\Delta\nu_{\rm rec} \sim \Delta\nu$, which will be
optimal for detecting an individual galaxy of a given velocity
width with a sufficiently high signal-to-noise.



4.2.2 Angular Resolution

The angular resolution of the radio telescope becomes
important when the HI source is extended rather than a point
source. For a single dish telescope the angular resolution is
~ $\lambda/D$, where $\lambda$ is the wavelength of the radiation
and $D$ is the diameter of the dish. Sources with angular sizes
$\theta$ smaller than this are indistinguishable from point
sources. For radio interferometers it is the lengths of the
baselines $B$ between pairs of dishes rather than the diameters
of individual dishes that dictate the angular resolution. If the
longest baseline is $B_{\rm max}$ and the shortest is
$B_{\rm min}$, then the interferometer will resolve angular scales
roughly from $\theta_{\rm min} = \lambda/B_{\rm max}$ to
$\theta_{\rm max} = \lambda/B_{\rm min}$. Interferometers can
therefore provide higher angular resolution than a single
dish, which is desirable because it allows for HI sources to be
mapped in greater detail. However, sources more extended
than ~ $\theta_{\rm max}$ get resolved out, and have their fluxes
underestimated. The fraction of a galaxy's flux that is resolved
out will depend on, for example, the precise distribution of
interferometer baselines and what one assumes for the surface
brightness profile of the galaxy (see, for example, the
discussion in Abdalla et al. 2009). In the case of the SKA,
the shortest baseline is expected to be 20 m, which corresponds
to an angular scale of $\theta_{\rm max} \sim 2100''$ at
$\lambda = 21$ cm, while the maximum baseline will be > 3000 km,
which corresponds to a resolution $\theta_{\rm min} \sim 0.1''$.
As we will see in §4.3, the predicted HI sizes of galaxies in
cosmological surveys are typically of order arcsecs or smaller,
so there should not be any problem in practice with galaxies
being resolved out.
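As a quick numerical check of the $\lambda/B$ scaling, the following sketch evaluates the largest angular scale recovered by a 20 m baseline at 21 cm. The function names are hypothetical and the estimate ignores order-unity factors that depend on the aperture illumination and the baseline distribution.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def angular_scale_arcsec(wavelength_m, baseline_m):
    """Angular scale ~ lambda / B, expressed in arcseconds."""
    return wavelength_m / baseline_m * ARCSEC_PER_RADIAN


def baseline_for_scale_m(wavelength_m, theta_arcsec):
    """Baseline length (m) for which lambda / B equals theta_arcsec."""
    return wavelength_m / (theta_arcsec / ARCSEC_PER_RADIAN)


# Shortest SKA baseline quoted in the text: 20 m, observed at 21 cm.
theta_max = angular_scale_arcsec(0.21, 20.0)
print(f"theta_max ~ {theta_max:.0f} arcsec")
```

The result is a little over 2100 arcsec, in line with the value quoted above; `baseline_for_scale_m` simply inverts the same relation.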



4.3 Predictions for Observables 

First, we examine the predictions of the four models for the 
number counts dN/dz of HI galaxies per square degree of 
sky as a function of redshift. By number counts, we mean the 
number of HI sources $dN$ that can be detected in a redshift
interval dz centred on a redshift z, 



$$\frac{dN}{dz} = \frac{dV}{dz} \int \frac{dn}{dM}\, f(M)\, dM, \qquad (13)$$



where dV/dz is the cosmological volume element at redshift 
z, dn/dM is the HI mass function at z and f(M) represents 
the fraction of galaxies with HI mass M that can be detected 
by the radio telescope. For simplicity we assume that f(M) 
Figure 7. Number counts of galaxies per square degree per unit redshift for a telescope with 100% (top left and right), 10% (bottom
left) and 1% (bottom right) of the effective area of a fiducial SKA ($A_{\rm eff} = 1$ km$^2$), for a deep peak flux limited hemispheric HI survey
lasting 1 year. We consider two integration times: $\tau = 12$ hours, appropriate for a field of view of 30 square degrees, which is typical of
pathfinders such as ASKAP and MeerKAT, and $\tau = 80$ hours, appropriate for a field of view of 200 square degrees, which is the maximum
field of view anticipated for the full SKA. Only sources that satisfy $S_{\rm obs} \geq n_\sigma S_{\rm rms}$ with $n_\sigma = 10$ are included. Note that we do not include
galaxies with cold gas masses below the resolution limit of $M_{\rm cold} = 10^{8.5} h^{-1} M_\odot$. In all of the panels we show how the counts change in
Bower2006a if we assume a "No Evolution" case in which the mass function predicted for $z = 0$ applies at all redshifts.



depends only on limiting sensitivity, which depends on $M_{\rm HI}$.
The angular resolution of the telescope also plays a role but
its influence on $f(M)$ requires further assumptions to be
made about, for example, the distribution of baselines, the
clumpiness of HI within a galaxy, its surface density profile,
etc., and so we ignore this dependence.
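A minimal sketch of how Eq. 13 can be evaluated numerically is given below. It is illustrative only: the flat cosmology ($\Omega_{\rm m} = 0.25$, $\Omega_\Lambda = 0.75$, $H_0 = 73$ km s$^{-1}$ Mpc$^{-1}$) is an assumed set of parameters, the Schechter function with placeholder parameters is a hypothetical stand-in for the model mass functions $dn/dM$, and $f(M)$ is taken to be a step function at a user-supplied limiting mass rather than being derived from the sensitivity equations.

```python
import math

C_KMS = 299792.458               # speed of light in km/s
H0 = 73.0                        # km/s/Mpc (assumed)
OMEGA_M, OMEGA_L = 0.25, 0.75    # assumed flat cosmology
SR_PER_SQDEG = (math.pi / 180.0) ** 2


def hubble(z):
    """Hubble parameter H(z) in km/s/Mpc for the assumed flat cosmology."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, steps=512):
    """Radial comoving distance in Mpc (trapezoidal integral of c / H)."""
    if z <= 0.0:
        return 0.0
    dz = z / steps
    total = 0.5 * (1.0 / hubble(0.0) + 1.0 / hubble(z))
    total += sum(1.0 / hubble(i * dz) for i in range(1, steps))
    return C_KMS * dz * total


def dV_dz_per_sqdeg(z):
    """Comoving volume per unit redshift per square degree, in Mpc^3."""
    return C_KMS / hubble(z) * comoving_distance(z) ** 2 * SR_PER_SQDEG


def dn_dlog10M(m, phi_star=6e-3, m_star=10 ** 9.8, alpha=-1.37):
    """Placeholder Schechter mass function per dex (Mpc^-3); parameters are illustrative."""
    x = m / m_star
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


def number_counts(z, m_lim, log_m_max=12.0, steps=400):
    """dN/dz per square degree from Eq. 13, with f(M) = 1 for M >= m_lim, else 0."""
    lo = math.log10(m_lim)
    if lo >= log_m_max:
        return 0.0
    dlog = (log_m_max - lo) / steps
    # Midpoint rule in log10(M) over the detectable mass range.
    n = sum(dn_dlog10M(10 ** (lo + (i + 0.5) * dlog)) for i in range(steps)) * dlog
    return dV_dz_per_sqdeg(z) * n


print(number_counts(0.5, m_lim=1e9))
```

In practice `m_lim` would itself be a function of redshift set by the limiting flux of the survey, and the placeholder mass function would be replaced by the mass function predicted by each model at that redshift.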



In estimating predicted number counts, we assume a
peak flux limited survey lasting 1 year on a radio telescope
with an effective area $A_{\rm eff}$ of 1%, 10% and 100% of the
fiducial SKA$^7$. We make the simplifying assumption that
the field of view is fixed with redshift and consider two cases
- 200 square degrees, which could be achieved on the SKA
(cf. Taylor 2008) and 30 square degrees, which is expected on
ASKAP (cf. Johnston et al. 2008). Assuming that the survey
covers a complete hemisphere (i.e. $2\pi$ sr), this gives effective
integration times on a patch of sky of $\tau = 80$ and $\tau = 12$


7 Assuming that the SKA has an effective area of 1 km$^2$, although
as noted already, the final SKA is likely to have an effective area
smaller than this.
hours respectively. The measured flux $S_{\rm obs}$ from a galaxy is
estimated using Eq. 9. The velocity width $\Delta V_{\rm los}$ is
taken to be $2 V_{\rm c,half} \sin i$, where $V_{\rm c,half}$ is the
circular velocity at the half-mass radius of the galaxy. Galaxies
are given random inclinations $i$, drawn from a uniform
distribution in $\cos i$. The measured flux is compared to the
limiting flux $S_{\rm rms}$ on a galaxy-by-galaxy basis (assuming
$\Delta\nu_{\rm rec} = \Delta\nu$ and using Eq. 10 to estimate
$\Delta\nu$) to estimate the signal-to-noise ratio. Our criterion
for detection is $S_{\rm obs}/S_{\rm rms} \geq n_\sigma = 10$.
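The survey set-up described in this paragraph can be sketched as follows. The function names are hypothetical, a year is taken to be 365.25 days of continuous observing with no overheads, and the fluxes passed to the detection test are assumed to have been computed elsewhere (e.g. from Eq. 9 and the sensitivity relation).

```python
import math
import random

HOURS_PER_YEAR = 365.25 * 24.0
HEMISPHERE_SQDEG = 2.0 * math.pi * (180.0 / math.pi) ** 2   # 2*pi sr in square degrees


def integration_time_hours(fov_sqdeg, survey_years=1.0):
    """Time spent on each field if a hemisphere is tiled once within the survey."""
    n_fields = HEMISPHERE_SQDEG / fov_sqdeg
    return survey_years * HOURS_PER_YEAR / n_fields


def los_velocity_width(v_c_half, rng):
    """Line-of-sight width 2 V_c,half sin(i), with cos(i) drawn uniformly on [0, 1)."""
    cos_i = rng.random()
    return 2.0 * v_c_half * math.sqrt(1.0 - cos_i ** 2)


def is_detected(s_obs, s_rms, n_sigma=10.0):
    """Detection criterion S_obs / S_rms >= n_sigma."""
    return s_obs >= n_sigma * s_rms


rng = random.Random(42)
for fov in (30.0, 200.0):
    print(f"FoV {fov:.0f} sq deg -> {integration_time_hours(fov):.1f} h per field")
widths = [los_velocity_width(200.0, rng) for _ in range(5)]
```

Under these assumptions the per-field times come out slightly above the rounded values of 12 and 80 hours adopted in the text, which would be consistent with allowing for some observing overhead.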

Fig. 7 shows how the number counts of HI galaxies
vary with redshift for surveys with $A_{\rm eff}$ of 100% (top left
panel for an integration time of 80 hours, top right panel for
an integration time of 12 hours), 10% (bottom left panel)
and 1% (bottom right panel) of the effective area of the
fiducial SKA ($A_{\rm eff} = 1$ km$^2$). $A_{\rm eff}$ is crucial in
determining how many galaxies can be detected and the range of
redshifts that can be probed. We find the number counts peak at
~ $4 \times 10^3$ / $4 \times 10^4$ / $3 \times 10^5$ galaxies per
square degree at $z \sim 0.1/0.2/0.5$ for a year long HI
hemispheric survey on a 1%/10%/100% SKA with a 30 square degree
field of view, corresponding to an integration time of 12 hours.
On a full SKA with a 200 square degree field of view (equivalent
to an integration time of 80 hours) the number counts peak at
$5 \times 10^5$ galaxies per square degree at $z \sim 0.6$.

A couple of interesting trends are immediately ap- 
parent in this figure. The first is that DeLucia2006a, 
Bower2006a and Font2008a, which all incorporate a form 
of AGN feedback, all predict broadly similar number 
counts out to $z \sim 3$. The differences of detail
reflect differences between the models that can be
readily inferred from the mass functions presented earlier.
For example, DeLucia2006a predicts enhanced number 
counts at lower redshifts and depressed number counts at 
intermediate to high redshifts with respect to Bower2006a 
and Font2008a. Here lower, intermediate and higher are 
defined relative to the redshift at which the number counts 
peak - approximately z ~0.5, 1.0 and 1.5 for 1%, 10% 
and 100% the effective area of the SKA respectively. The 
second is that Baugh2005M consistently predicts many 
more gas-rich galaxies than the other three models. There 
are several reasons for this: (1) Baugh2005M incorporates 
galactic super-winds rather than AGN feedback, which 
affects the cooling rate in massive haloes; (2) it uses weaker 
supernovae feedback than the other models; and (3) the 
star formation timescale in galactic discs does not scale 
with the disc dynamical time in Baugh2005M, whereas it 
does in the other models. 

So far we have neglected the important issue of
completeness of the number counts. As explained earlier, the
finite mass resolution of the Millennium simulation implies
that there is a minimum cold gas mass - and therefore a
minimum HI mass - that can be reliably resolved in the
Millennium galaxy formation models. For our assumed cold
gas mass limit of $10^{8.5} M_\odot$, this implies a HI mass limit
of ~ $10^{8.2} M_\odot$ (assuming a fixed H2/HI approach to
converting from cold gas mass to HI mass). The sensitivity of a
survey may be such that galaxies with HI masses below this
lower mass limit can be detected, and so we expect that the
number counts of HI sources will be underestimated below
some redshift $z_{\rm inc}$. This is because the cold gas mass
function has not converged at low masses and so the population
of sources is incomplete. As the sensitivity of a survey
increases, so too does $z_{\rm inc}$ because the survey probes the
HI mass function down to lower masses and the effect of this
incompleteness will be in evidence at higher redshifts.
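The mapping from the cold gas mass limit to the HI mass limit can be checked with a one-line conversion. The sketch below assumes a hydrogen mass fraction of 0.76 and a global H2/HI ratio of 0.4; these two numbers are assumptions introduced here for illustration (they reproduce the quoted shift from $10^{8.5}$ to ~ $10^{8.2} M_\odot$), and the function name is hypothetical.

```python
import math

HYDROGEN_FRACTION = 0.76   # assumed mass fraction of hydrogen in cold gas
H2_TO_HI = 0.4             # assumed fixed molecular-to-atomic ratio


def hi_mass_fixed_ratio(m_cold):
    """HI mass implied by a cold gas mass under a fixed H2/HI ratio."""
    return HYDROGEN_FRACTION * m_cold / (1.0 + H2_TO_HI)


log_hi_limit = math.log10(hi_mass_fixed_ratio(10 ** 8.5))
print(f"log10(M_HI limit / Msun) ~ {log_hi_limit:.2f}")
```

Because the conversion is a constant factor, the HI mass function under the fixed-ratio approach is simply the cold gas mass function shifted by about 0.27 dex in mass.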

We have assessed the issue of completeness for one
of the Durham models (Bower2006a) using three sets of
merger trees of increasing resolution (1, 2 and 4 times
the Millennium galaxy formation model resolution; MC1,
MC2 and MC3 respectively) generated using the Monte
Carlo prescription described in §3. We considered surveys
with effective areas of 1%/10%/100% of the SKA and
integration times of 12 and 80 hours. As we would expect,
the number counts predicted using the N-body merger
trees and the MC1 merger trees are consistent with one
another; reassuringly, the same holds true for the MC2 and
MC3 merger trees, for all of the survey sensitivities that we
considered. This suggests that the number counts we obtain
for Bower2006a with the N-body trees are converged and
the peak redshifts and number counts we find in Fig. 7 are
robust. We have also estimated the redshifts at which the
limiting survey sensitivity (and consequently the minimum
detectable HI mass) is comparable to the limiting
HI mass and we find that $z_{\rm inc} < 0.04$ for a 1%/10% SKA,
and $z_{\rm inc} \sim 0.08$ for a 100% SKA. These estimates confirm
that incompleteness effects are unlikely to affect the shape
and amplitude of the peak of the number counts shown in
Fig. 7 for Bower2006a.$^8$

Fig. 7 is based on the fixed H2/HI approach to
estimating HI masses from cold gas masses, but it is interesting
to ask how the number counts would change if we used the
variable H2/HI approach instead. In Fig. 8 we compare the
number counts of sources in Bower2006a estimated using
the fixed H2/HI approach (solid curves) and the variable
H2/HI approach (dashed curves). In the upper panel we
show the case for a 100% SKA with an integration time
of $\tau = 80$ hrs and in the lower panel we show the result
for a 1% SKA with an integration time of $\tau = 12$ hrs.
As discussed in §4.1, we expect the number of sources
to be systematically lower at higher redshifts if we adopt
a variable rather than a fixed H2/HI ratio and this is
confirmed by Fig. 8. The peak number of sources is lower in
the variable approach and the number counts decline more
sharply with increasing redshift. The difference is a factor
of 10 for a 100% (1%) SKA by $z = 3$ ($z = 1$).

Earlier we noted that the cold gas mass function
does not appear to evolve strongly with redshift over
$0 \lesssim z \lesssim 1$. It is therefore interesting to ask how
approximating the mass function at $z$ by the mass function
predicted at $z = 0$ affects the number counts of sources. In each
of the panels in Fig. 7 the solid short dashed curves highlight
the impact of making such an approximation - showing the

8 It is worth remarking that the precise value of $z_{\rm inc}$ will vary
from model to model because each model predicts slightly differ- 
ent shapes for the HI mass function with decreasing mass, and 
so we should repeat this exercise for each model in principle. 
However, the consistency of Bower2006a with the other models 
suggests that conclusions about completeness based on this model 
are sufficiently general to be applied to the other models. 
It is useful to compare our results with previous work,
and so we note that Abdalla & Rawlings (2005) have
predicted the redshift variation of $dN/dz$ for a full SKA
using semi-empirical models for the HI mass function.
This is interesting because we can compare predictions
based on semi-analytical models with their predictions
based on semi-empirical models. For a survey with an
integration time of 4 hours on a full SKA, they expect
to detect ~ $8 \times 10^4$ sources at the peak $dN/dz$; this
peak occurs at $z \sim 0.6$. Based on a similar integration
time, the semi-analytical models predict peak values of $dN/dz$
between ~ $8 \times 10^4$ (Bower2006a, DeLucia2006a, Font2008a)
and ~ $2 \times 10^5$ (Baugh2005M). All of these models peak
at $z \sim 0.5$. The semi-analytical models predict $dN/dz$
that declines more gently with increasing $z$ than the
semi-empirical models but this reflects in part differences in the
model assumptions (e.g. the assumption that HI, baryons
and dark matter follow similar mass functions) and in part
the conversion from cold gas to HI mass that we must
assume.
Figure 8. Number counts of galaxies per square degree per unit
redshift for a telescope with 100% (top) and 1% (bottom) of the
effective area of a fiducial SKA (1 km$^2$), for a peak flux limited HI
survey lasting 1 year, based on Bower2006a. The integration times
are $\tau = 80$ hrs and 12 hrs respectively. As before, only sources that
satisfy $S_{\rm obs} > n_\sigma S_{\rm rms}$ with $n_\sigma = 10$ are included and we ignore
galaxies with cold gas masses below the resolution limit. The solid
and dashed curves correspond to the (fiducial) fixed H2/HI and
variable H2/HI approaches to estimating the cold gas mass to HI
mass conversion factor.

behaviour of the Bower2006aNoEvol approximation, in
which the HI mass function at a given $z$ is replaced by the
HI mass function predicted by Bower2006a at $z = 0$. For the
10% and 100% SKA this would appear to be a reasonable
approximation over the redshift range $0 \lesssim z \lesssim 1.5$,
differing by ~ 10% at most. At $z > 1.5$ the predicted number
counts diverge, the degree of the discrepancy depending on the
sensitivity of the survey.



Second, we focus on the angular sizes of galaxies. Fig. 9
and Fig. 10 show how the angular size of HI galaxies
varies with redshift. We compute the angular diameter as
$\theta = 2 R_{\rm h}/D_{\rm ang}$, where $R_{\rm h}$ is the half-mass
radius of the galaxy and $D_{\rm ang}(z) = (1+z)^{-1} D_{\rm co}(z)$
is the angular diameter distance of the galaxy with respect to
the observer, where, as before, $D_{\rm co}(z)$ is the radial
comoving separation between source and observer. The points
correspond to the median angular diameters while the lower and
upper error bars indicate the 25th and 75th percentiles of the
angular diameter distributions at that redshift. Points are given
horizontal offsets of 0.025 in redshift to aid clarity. In Fig. 9
we plot the redshift dependence of the median angular diameters
of galaxies with HI masses $M_{\rm HI} \geq 10^9 h^{-1} M_\odot$
(upper panel) and $M_{\rm HI} \geq 10^{10} h^{-1} M_\odot$ (lower
panel) out to $z \lesssim 3$. In Fig. 10 we focus on the variation
predicted by Baugh2005M and DeLucia2006a for galaxies with
$M_{\rm HI} \geq 10^9 h^{-1} M_\odot$ over the redshift interval
$0 \lesssim z \lesssim 1$.



Knowledge of the expected redshift variation of the angular
diameter of an extended HI source is useful because it
allows one to estimate the number of sources that are likely
to be resolved out. Fig. 9 and Fig. 10 distill the information
presented in Fig. 3, where we showed how the half-mass
radii of galaxies varied with cold gas mass at a given
redshift. The models indicate that galaxies that have larger
HI masses tend to have larger half-mass radii, which corresponds
to more extended angular diameters at a given redshift.
The angular diameter decreases sharply between $z = 0$
and $z \sim 0.5$, and more gently at $z > 0.5$. The median
angular size varies between 5" and 10" at $z = 0.1$, 1" and 3" at
$z = 1$ and 1" and 3" at $z = 3$ for galaxies with HI masses in
excess of $10^9 h^{-1} M_\odot$; the upper and lower bounds correspond
to the predictions from DeLucia2006a and Baugh2005M.
Therefore, to resolve a typical galaxy with an HI mass of
$M_{\rm HI} > 10^9 h^{-1} M_\odot$ at $z \sim 1$ requires a maximum
baseline of order 100 km.
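The angular diameter and the baseline estimate can be reproduced with the sketch below. It assumes a flat cosmology with $\Omega_{\rm m} = 0.25$, $\Omega_\Lambda = 0.75$ and $H_0 = 73$ km s$^{-1}$ Mpc$^{-1}$, treats the half-mass radius as a proper length in kpc, and uses the redshifted wavelength $0.21(1+z)$ m in $\theta \sim \lambda/B$. The function names and the 5 kpc example radius are hypothetical.

```python
import math

C_KMS = 299792.458
H0 = 73.0                        # km/s/Mpc (assumed)
OMEGA_M, OMEGA_L = 0.25, 0.75    # assumed flat cosmology
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
LAMBDA_HI_M = 0.21               # rest wavelength of the HI line in metres


def comoving_distance_mpc(z, steps=512):
    """Radial comoving distance D_co(z) in Mpc (trapezoidal rule)."""
    if z <= 0.0:
        return 0.0

    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z)) + sum(inv_e(i * dz) for i in range(1, steps))
    return C_KMS / H0 * dz * total


def angular_diameter_arcsec(r_half_kpc, z):
    """theta = 2 R_h / D_ang, with D_ang = D_co / (1 + z)."""
    d_ang_kpc = comoving_distance_mpc(z) / (1.0 + z) * 1000.0
    return 2.0 * r_half_kpc / d_ang_kpc * ARCSEC_PER_RADIAN


def baseline_to_resolve_km(theta_arcsec, z):
    """Baseline for which lambda_obs / B equals theta, with lambda_obs = 0.21 (1 + z) m."""
    lam_obs = LAMBDA_HI_M * (1.0 + z)
    return lam_obs / (theta_arcsec / ARCSEC_PER_RADIAN) / 1000.0


print(f"theta ~ {angular_diameter_arcsec(5.0, 1.0):.2f} arcsec for R_h = 5 kpc at z = 1")
print(f"B ~ {baseline_to_resolve_km(1.0, 1.0):.0f} km to resolve 1 arcsec at z = 1")
```

With these assumptions a 1" source at $z = 1$ calls for a baseline of a little under 100 km, consistent with the order-of-magnitude figure quoted above.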
Figure 10. The predicted redshift variation of the angular diameter
of galaxies with HI masses $M_{\rm HI} > 10^9 h^{-1} M_\odot$ at $z < 1$ in

Figure 9. The predicted redshift variation of the angular diameter
of galaxies with HI masses $M_{\rm HI} > 10^9 h^{-1} M_\odot$ (upper panel)
and $M_{\rm HI} > 10^{10} h^{-1} M_\odot$ (lower panel). Different symbols
correspond to different models, as indicated by the legend.



5 SUMMARY 

Neutral atomic hydrogen is the fundamental baryonic building
block of galaxies and understanding how its abundance
varies over cosmic time will provide us with important
insights into galaxy formation. Few observational data exist
for the abundance of neutral hydrogen at redshifts $z > 0.05$,
the extent of the HIPASS survey (Meyer et al. 2004), but
this will change with the advent of the Square Kilometre
Array, which will see first light by about 2020. The SKA
will transform cosmology and galaxy formation (e.g. Blake
et al. 2004; Braun 2007), allowing us to probe the cosmic
HI distribution out to redshifts $z \sim 3$. In the meantime, HI
surveys on SKA pathfinders such as ASKAP (Johnston et
al. 2008), MeerKAT (Jonas 2007) and APERTIF (Verheijen
et al. 2008) will provide us with important initial glimpses
into the cosmic HI distribution out to redshifts $z \sim 1$.
Because we have so few observational data for the cosmic HI
distribution beyond $z \sim 0.05$, the coming decade promises
to provide powerful tests of the predictions of theoretical
galaxy formation models. It is therefore timely to ask what
galaxy formation models tell us about the abundance of
neutral hydrogen in galaxies.

In this paper we have investigated four of the currently
favoured galaxy formation models - those of Baugh et al.
(2005), Bower et al. (2006), De Lucia & Blaizot (2007) and
Font et al. (2008) - and determined what they predict for the
mass function of cold gas in galaxies and how it evolves with
redshift. Each of the models uses merger trees derived from
the Millennium simulation (cf. Springel et al. 2005) and so
any differences between the model predictions reflect intrinsic
differences in the physics incorporated into the models
themselves. Three of the models (Baugh2005M, Bower2006a
and Font2008a) use the Durham semi-analytical code GALFORM
(cf. Cole et al. 2000), whereas the fourth (DeLucia2006a)
uses the Munich semi-analytical code. Arguably
the most important difference between the models is in the
precise treatment of feedback; Bower2006a, DeLucia2006a
and Font2008a all incorporate a form of AGN feedback
whereas Baugh2005M does not, instead favouring galactic
super-winds.

Interestingly we find that the model predictions are
broadly consistent with one another. Differences between
the models reflect (1) the use of AGN heating to suppress
gas cooling in massive haloes (which is used in Bower2006a,
Font2008a and DeLucia2006a but not in Baugh2005M,
which invokes supernovae-driven super-winds); (2) the
strength of supernovae feedback, which is weakest in
Baugh2005M; and (3) the scaling (or lack of) of the star
formation timescale in galactic discs with the disc dynamical
time (scaling is assumed in Bower2006a, DeLucia2006a
and Font2008a whereas it is not in Baugh2005M).

We have focused on three particular aspects of the cold 
gas properties of galaxies, namely (i) the mass function 
of cold gas in galaxies and the relationship between (ii) a 
galaxy's cold gas mass and its half-mass radius and (iii) its 
cold gas mass and rotation speed (i.e. circular velocity) at 
this radius. 

Mass function of cold gas in galaxies: The predictions of
Font2008a and Bower2006a are generally very similar, with
differences only apparent at small cold gas masses. This
is unsurprising because Font2008a descends directly from
Bower2006a, the principal difference between the models
being the improved treatment of gas stripping by the hot
intra-cluster medium in Font2008a. At $z = 0$ we find that
Bower2006a and Font2008a systematically over-predict the
numbers of galaxies with HI masses in excess of $10^9 h^{-2} M_\odot$
when compared to the observed mass function derived from
HIPASS (cf. Zwaan et al. 2005), while Baugh2005M and
DeLucia2006a provide reasonable descriptions (i.e. in terms
of shape and amplitude) of the observed mass function.
Interestingly we find that the cold gas mass function shows
little evolution out to redshifts of $z \sim 3$ in all four models.

Relationship between cold gas mass and half-mass radius,
rotation speed: At fixed cold gas mass, both Bower2006a
and Font2008a predict half-mass radii and rotation speeds
(circular velocities measured at these half-mass radii) that
are in excellent agreement with each other, as we might
expect. Half-mass radii are slightly but systematically larger
(by ~ 25%) in DeLucia2006a than in Bower2006a and
Font2008a, but the three models predict similar rotation
speeds (to within the width of the distribution). This level
of agreement is remarkable, given the number of subtle (and
some not so subtle) differences between the frameworks
underpinning DeLucia2006a and Bower2006a/Font2008a
(described in §2). In contrast, Baugh2005M predicts half-mass
radii that are systematically smaller (by ~ 60%) and rotation
speeds that are systematically larger (by ~ 20%) at
fixed cold gas mass than in Bower2006a, DeLucia2006a and
Font2008a. It is worth noting that Baugh2005M predicts a
size-luminosity relation for late-type galaxies that is in very
good agreement with SDSS data, whereas the agreement
between Bower2006a (and by extension Font2008a) and the
observational data is poor (cf. Gonzalez et al. 2009).

We took the predicted mass functions of cold gas in
galaxies and used them to derive number counts of HI galaxies
for future all-sky HI surveys. Rather than adopting a
specific design, we considered surveys carried out on radio
telescopes with effective collecting areas $A_{\rm eff}$ that are
percentages of a fiducial Square Kilometre Array, with an
effective collecting area of 1 km$^2$. We focused on surveys with
$A_{\rm eff}$ of 1%, 10% and 100% of the SKA and assumed that
these surveys lasted for 1 year, with integration times of
between 12 and 80 hours within individual fields of view. As we
pointed out in §4.2.1, $A_{\rm eff}$ plays a crucial role in
determining the sensitivity $S_{\rm rms}$ of a radio telescope, which
in turn dictates how many HI galaxies are likely to be detectable
by the survey. SKA pathfinders such as ASKAP, MeerKAT
and APERTIF will have effective areas of order ~ 1%.

We examined two possible approaches to converting
cold gas masses to HI masses. The first simply assumed that
the ratio of molecular-to-atomic hydrogen (H2/HI) is fixed
for all galaxies at all redshifts (cf. Baugh et al. 2004). The
second assumed that the H2/HI ratio is variable, depending
on individual galaxy properties according to the model
of Obreschkow et al. (2009a), which in turn is based on an
empirical relation between the ratio of the surface densities
of H2 to HI and the gas pressure found for local galaxies by
Blitz & Rosolowsky (2009) and Leroy et al. (2008). This is
an important consideration because how one converts from
cold gas mass to HI mass will determine the 21-cm luminosity
of a galaxy and therefore its detectability in an HI survey
of a given sensitivity. We computed the observed flux $S_{\rm obs}$
for each galaxy using both its HI mass and its circular velocity
at the half-mass radius to define its velocity width and
required that $S_{\rm obs} \geq 10\, S_{\rm rms}$ for the galaxy to be
detected.

As for the cold gas mass functions, we find that the
models that include a form of AGN feedback predict broadly
similar number counts; Baugh2005M predicts many more
gas rich galaxies, as many as a factor of ~ 2-3 more at the
redshift at which the number counts peak. The choice of cold
gas to HI mass conversion factor is also very important,
especially at higher redshifts; adopting a variable H2/HI ratio
predicts that galaxies should be predominantly molecular
rather than atomic hydrogen at high redshifts, and this has
a profound impact on the number of HI sources one predicts
(see Fig. 8). Clearly more work is needed to put this on a
more secure theoretical footing. Interestingly we find that
approximating the HI mass function at $z \lesssim 2$ by the $z = 0$
HI mass function has little impact on the number counts one
might expect to measure.

In addition, we estimated the dependence of the median
angular diameter of HI galaxies on redshift, for galaxies
with HI masses $M_{\rm HI} > 10^9 h^{-1} M_\odot$ and
$M_{\rm HI} > 10^{10} h^{-1} M_\odot$.
This is useful to know because it allows one to estimate
the fraction of the flux that is likely to be lost because it
has been resolved out. The models indicate that galaxies
with larger HI masses tend to have larger half-mass radii
and therefore more extended angular diameters at a given
redshift. We found that the angular diameter decreases
sharply between $0 \lesssim z \lesssim 0.5$ and more gently at
$z \gtrsim 0.5$. The median angular size varies between 5" and 10"
at $z = 0.1$, 0.5" and 3" at $z = 1$ and 0.2" and 1" at $z = 3$ for
galaxies with HI masses in excess of $10^9 h^{-1} M_\odot$, where the
lower and upper limits correspond to the predictions of
Baugh2005M and DeLucia2006a.

We have concentrated on the most straightforward
measurement one can make in future HI surveys, namely the
number counts of galaxies. However, we have considered only
global counts - we have not considered how the HI mass
functions and number counts might depend on local
environment. Certainly there is good reason to expect that
environment should play a role in shaping the HI mass function
of galaxies. For example, we might expect that the amount
of HI in a galaxy will be reduced by ram pressure stripping
as it falls through a dense intra-cluster medium; this would
become apparent as a quenching of the star formation (e.g.
Balogh et al. 2000; Quilis et al. 2000), but it should also
be evident in an environmental dependence of a galaxy's HI
properties. Indeed, there is some observational evidence to
suggest that the HI mass function does depend on environment
(e.g. Zwaan et al. 2005; Springob et al. 2008; Kilborn
et al. 2009); for example, Kilborn et al. (2009) find evidence
that the slope of the low-mass end of the HI mass function
in galaxy groups decreases with decreasing HI mass, in
contrast to the global HI mass function found in HIPASS by
Zwaan et al. (2005).

These results are interesting because they suggest that 
environment plays an important role in determining the HI 
properties of galaxies. In forthcoming papers we will explore 
precisely what galaxy formation models predict for the clus- 
tering of cold gas (Kim et al., in preparation) and we will 
explore precisely what role environment plays in shaping a 
galaxy's cold gas - and consequently HI - properties. 